Several dark energy experiments are available from a single large-area imaging survey, and may
be combined to improve cosmological parameter constraints and/or test inherent systematics. Two
promising experiments are cosmic shear power spectra and counts of galaxy clusters. However, the
two experiments probe the same cosmic mass density field in large-scale structure, and therefore the
combination may be less powerful than first thought.

We investigate the cross-covariance between the cosmic shear power spectra and the cluster counts
based on the halo model approach, where the cross-covariance arises from the three-point correlations
of the underlying mass density field. Fully taking into account the cross-covariance as well as
non-Gaussian errors on the lensing power spectrum covariance, we find a significant cross-correlation
between the lensing power spectrum signals at multipoles $l \sim 10^3$ and the cluster counts containing
halos with masses $M > 10^{14} M_\odot$. Including the cross-covariance for the combined measurement
degrades, and in some cases improves, the total signal-to-noise ratios by up to $\sim \pm 20\%$ relative to when
the two are independent. For cosmological parameter determination, the cross-covariance has a
smaller effect as a result of working in a multi-dimensional parameter space, implying that the two
observables can be considered independent to a good approximation. We also discuss that cluster
count experiments using lensing-selected mass peaks could be more complementary to cosmic shear
tomography than mass-selected cluster counts of the corresponding mass threshold. Using lensing
selected clusters with a realistic usable detection threshold ($(S/N)_{\rm cluster} \sim 6$ for a ground-based
survey), the uncertainty on each dark energy parameter may be roughly halved by the combined
experiments, relative to using the power spectra alone.



I. INTRODUCTION 

In recent years great observational progress has been
made in measuring the constituents of the universe (e.g.
[?, ?, ?]). It appears that the universe is currently
dominated by an unexpected component that is causing the
universe to accelerate in its expansion. This component
is dubbed "dark energy". Understanding the nature of
dark energy is one of the most fundamental questions that
remain unresolved with the current cosmological data sets
(e.g. [1, ?]). This is now the focus of several planned
future surveys [?].

Whether the accelerating expansion is a consequence
of the cosmological constant, a new fluid or a
modification to Einstein's gravity, these future surveys
will provide key information. In addition they will
provide a wealth of further cosmological information, such
as constraints on the neutrino mass and the spectrum of
primordial perturbations generated in the early universe
(e.g. [?]).

Combining several techniques accessible from different 
cosmological observables is often a powerful way to im- 
prove constraints on cosmology. However, care must be 
taken if the observables are not completely independent. 
Two of the most promising methods for constraining the
dark energy are galaxy cluster counts and cosmic shear
(e.g. [?]).

Clusters of galaxies contain galaxies, hot gas and dark
matter in the ratio of approximately 1:10:100 [15]. They are
the largest gravitationally bound objects in the universe
and the number of clusters of galaxies has long been
recognized as a powerful probe of cosmology [?].
Counting clusters of galaxies as a function of redshift
allows a combination of structure growth
and geometrical information to be extracted, thus
potentially allowing constraints on the nature of dark energy
[?, ?, 23, 24, ?]. If cluster masses can be measured
accurately then the shape of the mass function also helps
to break degeneracies [?]. The distribution of clusters
on the sky (e.g. two-point correlation function) carries
additional information on dark energy [13, ?].

The bending of light by mass, gravitational lensing,
causes images of distant galaxies to be distorted. Source
galaxies are mostly too weakly distorted for the effect to
be measured in a single galaxy; instead, surveys containing
a few million galaxies are required to detect the
signal in a statistical way. This cosmic shear signal has
been observed [?] and used to constrain cosmology
(most recently [33, 34, ?, 36]). By using redshift
information of source galaxies the evolution of the dark
matter distribution with redshift can be inferred. Hence,
measuring the cosmic shear two-point function as a function
of redshift and separation between pairs of galaxies
can be used to constrain the geometry of the universe as
well as the growth of mass clustering. This method has
emerged as one of the most promising to obtain precise
constraints on the nature of dark energy if systematics
are well under control [13, ?].

Future optical imaging surveys suitable for cosmic
shear analysis will also allow the identification of clusters
of galaxies. This could be done either using the colors
of the cluster members (e.g. [10, 41]) or using peaks in
the gravitational lensing shear field (e.g. [42, 43, 44]). In
addition, cluster surveys in other wavebands will overlap
with the cosmic shear surveys, allowing detection using
X-rays and the thermal Sunyaev-Zel'dovich (SZ) effect.

Clusters of galaxies produce a large gravitational lens- 
ing effect on distant galaxies, therefore cluster counts 
and cosmic shear will not be strictly statistically inde- 
pendent. The volume surveyed is finite and therefore the 
number of clusters observed will not be exactly equal to 
the average over all universe realizations. If the num- 
ber of clusters happens to be higher for a given survey 
region, then the cosmic shear signal is also likely to be 
higher. Although the volumes will be large, and thus the
deviation is small, this may amount to a significant
uncertainty in the dark energy parameters as obtained by
cluster counts, and dominates the non-Gaussian errors
on the cosmic shear [?].

One aspect of this cross-correlation was discussed in
[?] and found to be negligible. However, here we make
a full treatment of this effect using the halo model for
non-linear structure formation, and quantify the resulting
change in joint constraints on the dark energy
parameters.

The structure of our paper is as follows. In § II we
describe how our observables, cluster number counts and
lensing power spectra, can be expressed in terms of the
background cosmological model and the density
perturbations. In § III we describe a methodology to compute
covariances of the cluster counts and the lensing power
spectra, and the cross-covariance between the two
observables. The detailed derivations of the covariances are
presented in the Appendix. In § IV we first study the total
signal-to-noise ratios expected for a joint experiment of
the cluster counts and the lensing power spectrum, fully
including the cross-covariance predicted from the $\Lambda$CDM
cosmologies. We then present forecasts for cosmological
parameter determination for the joint experiment, with
particular focus on forecasts for the dark energy
parameter constraints. Finally, we present conclusions and
discussion in § V.



II. PRELIMINARIES 
A. A CDM model 

We work in the context of spatially flat cold dark matter
models for structure formation. The expansion history
of the universe is given by the scale factor $a(t)$ in a
homogeneous and isotropic universe (e.g., see [50]). We
describe the universe in terms of the matter density $\Omega_{\rm m}$
(the cold dark matter plus the baryons) and the dark energy
density $\Omega_{\rm de}$ at present (in units of the critical density
$3H_0^2/(8\pi G)$, where $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$ is the
Hubble parameter at present). In general the expansion
rate, the Hubble parameter, is given by



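$$H^2(a) = H_0^2\left[\Omega_{\rm m}a^{-3} + \Omega_{\rm de}\exp\left(-3\int_1^a da'\,\frac{1+w(a')}{a'}\right)\right], \qquad (1)$$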



where we have employed the normalization $a(t_0) = 1$
today and $w(a)$ specifies the equation of state for dark
energy as $w(a) \equiv p_{\rm de}(a)/\rho_{\rm de}(a)$. Note that $\Omega_{\rm m} + \Omega_{\rm de} = 1$
and $w = -1$ corresponds to a cosmological constant. The
comoving distance $\chi(a)$ from an observer at $a = 1$ to a
source at $a$ is expressed in terms of the Hubble parameter
as



$$\chi(a) = \int_a^1 \frac{da'}{H(a')\,a'^2}. \qquad (2)$$



This gives the distance-redshift relation $\chi(z)$ via the
relation $1 + z = 1/a$.
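As a rough numerical illustration of Eqs. (1) and (2), the sketch below evaluates $H(a)/H_0$ and $\chi(a)$ for the special case of a constant equation of state, for which the exponential in Eq. (1) reduces to $a^{-3(1+w)}$. The function names, the default $\Omega_{\rm m} = 0.3$ and the Simpson integration are illustrative assumptions, not choices taken from the text; distances are in units of $c/H_0$.

```python
import math

def hubble(a, omega_m=0.3, w=-1.0):
    """H(a)/H_0 from Eq. (1) for a flat universe with constant w (assumed)."""
    omega_de = 1.0 - omega_m  # flatness: Omega_m + Omega_de = 1
    # For constant w, exp(-3 * int_1^a (1+w)/a' da') = a**(-3(1+w)).
    return math.sqrt(omega_m * a**-3 + omega_de * a**(-3.0 * (1.0 + w)))

def comoving_distance(a, omega_m=0.3, w=-1.0, n_steps=2000):
    """chi(a) from Eq. (2) in units of c/H_0, using Simpson's rule.

    n_steps must be even for Simpson's rule.
    """
    if a >= 1.0:
        return 0.0
    step = (1.0 - a) / n_steps

    def integrand(x):
        return 1.0 / (hubble(x, omega_m, w) * x * x)

    total = integrand(a) + integrand(1.0)
    for k in range(1, n_steps):
        total += (4.0 if k % 2 else 2.0) * integrand(a + k * step)
    return total * step / 3.0

# Distance to z = 1 (a = 0.5) for two illustrative values of w.
chi_lcdm = comoving_distance(0.5, w=-1.0)
chi_w09 = comoving_distance(0.5, w=-0.9)
```

In this sketch, raising $w$ from $-1$ to $-0.9$ at fixed $\Omega_{\rm m}$ increases $H(a)$ in the past and therefore shortens the comoving distance to a given redshift.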

Next we need the redshift growth of density perturbations.
In linear theory after matter-radiation equality,
all Fourier modes of the mass density perturbation,
$\delta(\mathbf{x}) \equiv \delta\rho_{\rm m}(\mathbf{x})/\bar\rho_{\rm m}$, grow at the same rate, the growth
rate (e.g. see Eq. 10 in [?] for details).



B. Number counts of galaxy clusters 

The galaxy cluster observables we will consider in this 
paper are the number counts drawn from a given survey 
region. Clusters can be found via their notable observa- 
tional properties such as gravitational lensing, member 
galaxies, X-ray emission and the SZ effect. For number 
counts we simply treat clusters as points; in other words, 
we do not care about the distribution of mass within a 
cluster. Hence, the number density field of clusters at 
redshift z can be expressed as 



$$n_{\rm cl}(\mathbf{x}) = \sum_i S(m_i; z)\,\delta_D^3(\mathbf{x} - \mathbf{x}_i), \qquad (3)$$



where $\delta_D^3(\mathbf{x})$ is the three-dimensional Dirac delta function.
The summation runs over halos (the subscript $i$
stands for the $i$-th halo), and $S(m_i; z)$ denotes the
selection function that discriminates the halos used for the
cluster number counts statistic from other halos.

In this paper, we will consider the following two toy
models for the selection function, to develop intuition
for the importance of cross-correlation between cluster
counts and the lensing power spectrum and to make
a comparison between cosmological parameter estimations
derived from different cluster samples. Note that
throughout this paper we will ignore uncertainties
associated with the cluster mass-observable relation, which could
significantly degrade the ability of cluster counts to
constrain cosmological parameters (e.g. [?]). We shall
discuss this issue in § IV E.

A mass-limited cluster sample - The first toy model we
will consider is a mass-limited cluster sample. For this
model, we include all halos with masses above a given
mass threshold:



$$S(m; z) = \begin{cases} 1, & \text{if } m > M_{\rm min}, \\ 0, & \text{otherwise.} \end{cases}$$



To a zeroth-order approximation, the mass-limited
selection may mimic a cluster sample derived from a
flux-limited survey of clusters via the SZ effect, as this effect
is free of the surface brightness dimming effect (e.g. see
[?]).

A lensing-based cluster sample - A lensing measurement
allows one to make a reconstruction of the
two-dimensional mass distribution projected along the line
of sight [?]. A high peak in the mass map provides a
strong candidate for a massive cluster (see [42, 43, 44]
for an implementation of this method to actual data).
To be more explicit, one can define the height or significance
of each peak in the reconstructed mass map using the
effective signal-to-noise ratio (see [?] for details):



$$\left(\frac{S}{N}\right)_{\rm cluster} = \frac{\kappa_{\rm cluster}(m, z)}{\sigma_N}. \qquad (4)$$



Here $\kappa_{\rm cluster}$ is the convergence amplitude due to a given
cluster at redshift $z$ and with mass $m$, and $\sigma_N$ is the
rms fluctuation in $\kappa$ due to the intrinsic ellipticity noise
arising from a finite number of background galaxies.
Note that we assume an NFW profile [55] with
profile parameters modeled in [?], and consider the
convergence field smoothed with a Gaussian filter of angular
scale $\theta_s = 1'$. To compute $(S/N)_{\rm cluster}$ for a cluster
at redshift $z$, we take into account the remaining
fraction of background galaxies behind the cluster for
a given redshift distribution of the whole galaxy population
(see § IV A). This accounts for the variation of the mean
redshift and number density of the background galaxies
with cluster redshift, which changes both the signal and
the intrinsic noise in Eq. (4).

From the reconstructed mass map, a cluster sample
may be constructed by counting mass peaks with heights
above a given threshold, $\nu_{\rm min}$: the selection function is
given by



$$S(m; z) = \begin{cases} 1, & \text{if } (S/N)_{\rm cluster} \ge \nu_{\rm min}, \\ 0, & \text{otherwise.} \end{cases}$$



As carefully investigated in [54], the minimum mass of
clusters detectable with a given threshold varies with
cluster redshift; clusters at intermediate redshift between
the observer and a typical source redshift are most easily
detectable, while only more massive clusters can be
detected at redshifts smaller and greater than this intermediate
redshift, as discussed below.

We will employ the halo model to quantify the
statistical properties of cluster observables. In the halo model
approach, we assume that all the matter is in halos.
Following the formulation developed in [?, 53, ?, ?] (also
see Appendix A 1 and [60] for a thorough review), the
ensemble average of Eq. (3) can be computed as

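$$\begin{aligned}
\bar n_{\rm cl} \equiv \langle n_{\rm cl}(\mathbf{x})\rangle &= \left\langle \sum_i S(m_i; z)\,\delta_D^3(\mathbf{x}-\mathbf{x}_i)\right\rangle \\
&= \int dm \int d^3\mathbf{x}'\, S(m; z)\,\delta_D^3(\mathbf{x}-\mathbf{x}') \left\langle \sum_i \delta_D(m-m_i)\,\delta_D^3(\mathbf{x}'-\mathbf{x}_i)\right\rangle \\
&= \int dm\, S(m; z)\,n(m)\int d^3\mathbf{x}'\,\delta_D^3(\mathbf{x}-\mathbf{x}') \\
&= \int dm\, S(m; z)\,n(m), \qquad (5)
\end{aligned}$$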



where $n(m)$ is the halo mass function corresponding to
the redshift considered and we have used the ensemble
average $\langle \sum_i \delta_D(m - m_i)\,\delta_D^3(\mathbf{x}_i - \mathbf{x}')\rangle = n(m)$. Thus, as
expected, the ensemble average of the cluster number
density field is given by the integral of the halo mass
function, which does not depend on the cluster distribution
and spatial position. For the halo mass function, we
employ the Sheth-Tormen fitting formula [?], modified
from the original Press-Schechter function [64]. Note that
we use parameter values $a = 0.75$ and $p = 0.3$ in the
formula following the discussion in [?]. We assume that the
mass function can be applied to dark energy cosmologies
by replacing the growth rate appearing in the formula
with that for a dark energy model [?].
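For orientation, the sketch below writes down the Sheth-Tormen multiplicity function in one common convention, $\nu f(\nu) = A\,[1 + (a\nu)^{-p}]\sqrt{a\nu/2\pi}\,e^{-a\nu/2}$ with $\nu \equiv [\delta_c/\sigma(m)]^2$. The convention for $\nu$ and the function names are assumptions made here, since the text does not spell the formula out, and the mapping from $\nu$ to mass through $\sigma(m)$ is not shown. The normalization $A$ is fixed by requiring $\int d\nu\, f(\nu) = 1$, consistent with the halo model assumption that all matter is in halos.

```python
import math

def st_amplitude(p=0.3):
    """Normalisation A such that the multiplicity function integrates to one."""
    return 1.0 / (1.0 + 2.0**(-p) * math.gamma(0.5 - p) / math.sqrt(math.pi))

def st_multiplicity(nu, a=0.75, p=0.3):
    """nu * f(nu) with nu = (delta_c / sigma(m))**2 (assumed convention)."""
    x = a * nu
    return (st_amplitude(p) * (1.0 + x**(-p))
            * math.sqrt(x / (2.0 * math.pi)) * math.exp(-0.5 * x))

# Setting a = 1 and p = 0 recovers the Press-Schechter form,
# nu * f(nu) = sqrt(nu / (2 pi)) * exp(-nu / 2).
ps_value = st_multiplicity(1.0, a=1.0, p=0.0)
```

Note that the normalization depends only on $p$, so changing $a$ from its original value to $a = 0.75$ rescales the mass variable without violating $\int d\nu\, f(\nu) = 1$.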

A more useful quantity often considered in the liter- 
ature is the total number counts of clusters available 
from a given survey, which is obtained by integrating 
the three-dimensional number density field over a range 
of redshifts surveyed. Cluster redshifts are rather easily 
available even from a multicolor imaging survey alone be- 
cause their central bright galaxies, or red sequence galax- 
ies, have secure photometric redshift estimates. Having 
these facts in mind we will use as our observable the an- 
gular number density averaged over a survey area and 
divided into redshift bins: 

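$$N_{(b)} = \int d^2\boldsymbol\theta\, W(\boldsymbol\theta)\int_0^{\chi_H} d\chi\,\frac{d^2V}{d\chi\, d\Omega}\sum_i S_{(b)}(m_i; z)\,\delta_D(\chi-\chi_i)\,\delta_D^2(\chi\boldsymbol\theta - \chi_i\boldsymbol\theta_i), \qquad (6)$$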

where $W(\boldsymbol\theta)$ is the window function of the survey, defined
so that it is normalized as $\int d^2\boldsymbol\theta\, W(\boldsymbol\theta) = 1$, $\chi_H$ is the
distance to the Hubble horizon, and the comoving volume
per unit comoving distance and unit steradian is given by
$d^2V/d\chi\, d\Omega = \chi^2$ for a flat universe. The subscript in the
round bracket, $(b)$, stands for the $b$-th redshift bin for the
cluster number counts. In the following, we will simply
consider the sharp redshift selection function



$$S_{(b)}(m_i; z) = \begin{cases} S(m_i), & \text{if } z_{(b),{\rm lower}} < z < z_{(b),{\rm upper}}, \\ 0, & \text{otherwise.} \end{cases} \qquad (7)$$



Note that the redshift $z$ appearing in the argument of
$S_{(b)}(m_i; z)$ is related to the comoving distance $\chi$ via the
relation $d\chi = dz/H(z)$.

Using the halo model, the expectation value of the
angular number density can be computed from the ensemble
average of Eq. (6) as



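$$\bar N_{(b)} = \int d^2\boldsymbol\theta\, W(\boldsymbol\theta)\int_0^{\chi_H} d\chi\,\frac{d^2V}{d\chi\, d\Omega}\int dm\, S_{(b)}(m; z)\,n(m). \qquad (8)$$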



Thus, the expectation value again does not depend on
the cluster distribution. The sensitivity of the number
density to dark energy arises from the comoving volume
$d^2V/d\chi\, d\Omega$ and the mass function $n(m)$ [21].
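The structure of Eq. (8) for a single sharp redshift bin can be sketched as follows, using $d^2V/d\chi\, d\Omega = \chi^2$ and $d\chi = dz/H(z)$. This is a minimal sketch under stated assumptions: a flat universe with constant $w$, distances in units of $c/H_0$, and a hypothetical user-supplied function `n_selected(z)` standing in for the mass integral $\int dm\, S(m; z)\, n(m)$, which is not modeled here.

```python
import math

def hubble_z(z, omega_m=0.3, w=-1.0):
    """H(z)/H_0 for a flat universe with constant w (Eq. 1 with a = 1/(1+z))."""
    return math.sqrt(omega_m * (1.0 + z)**3
                     + (1.0 - omega_m) * (1.0 + z)**(3.0 * (1.0 + w)))

def mean_counts_per_sr(z_lo, z_hi, n_selected, omega_m=0.3, w=-1.0, dz=1e-3):
    """Eq. (8) for one bin: integrate chi^2 * n_selected(z) over d chi.

    n_selected(z) is a hypothetical callable returning the comoving number
    density of selected halos at redshift z.
    """
    chi = 0.0
    total = 0.0
    for k in range(int(round(z_hi / dz))):
        z_mid = (k + 0.5) * dz
        dchi = dz / hubble_z(z_mid, omega_m, w)  # d chi = dz / H(z)
        chi_mid = chi + 0.5 * dchi               # distance at the step midpoint
        if z_mid > z_lo:                         # sharp bin edges, as in Eq. (7)
            total += chi_mid**2 * n_selected(z_mid) * dchi
        chi += dchi
    return total

# With a constant density the result is just the comoving volume per steradian,
# which isolates the geometric part of the dark energy sensitivity.
volume_w10 = mean_counts_per_sr(0.2, 0.5, lambda z: 1.0, w=-1.0)
volume_w09 = mean_counts_per_sr(0.2, 0.5, lambda z: 1.0, w=-0.9)
```

With a constant density the integral reduces to $(\chi_{\rm upper}^3 - \chi_{\rm lower}^3)/3$, which gives a simple check of the volume factor independent of any mass function.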



[Figure 1: angular number density of clusters versus redshift $z$.]

FIG. 1: The average angular number density of halos with
masses above a given threshold, per unit square arcminute
and per unit redshift interval. The upper pair and lower pair
of curves are for halos with $M/M_\odot > 10^{14}$ and $5 \times 10^{14}$,
respectively. Increasing the dark energy equation of state from
$w = -1$ to $w = -0.9$ decreases the number density, as shown
by the dashed curves.

Fig. 1 shows the average angular number density of halos
with masses greater than a given threshold, per unit
square arcminute and per unit redshift interval, assuming
the fiducial model defined in § IV A. Increasing the
dark energy equation of state from our fiducial model
$w = -1.0$ to $w = -0.9$ decreases the number density,
because the change decreases both the comoving volume
$d^2V/d\chi\, d\Omega$ and the number density of cluster-scale
halos, for a given CMB normalization of density
perturbations. Comparing the results for mass thresholds
$M_{\rm min}/M_\odot = 10^{14}$ and $5 \times 10^{14}$ clarifies that a factor of
5 increase in the mass threshold leads to a significant
decrease in the number density, reflecting the mass
sensitivity of the halo mass function in its exponential tail.

In Fig. 2 we present the number density for the lensing-based
cluster sample, in which clusters having a lensing
signal greater than a given detection threshold are
included in the sample as discussed around Eq. (4). Note
that to compute the results shown in this plot we
assumed the redshift distribution of galaxies described in
§ IV A and the NFW mass profile to model the cluster
lensing. In practice high detection thresholds such as
$(S/N)_{\rm cluster} \gtrsim 6$ are necessary in order to make robust
estimates for cluster counts, because contamination by
false peaks due to intrinsic ellipticities or the projection
effect is expected to be low for such high thresholds (see
[?] for the details). Comparing with the number density
for a mass-selected sample shown by the dot-dashed
curves, one can roughly find which mass and redshift
ranges of clusters are probed by the lensing-based cluster
sample. For example, the cluster sample with lensing
signal $(S/N)_{\rm cluster} > 10$ contains massive clusters with
masses $M \gtrsim 10^{15} M_\odot$ over redshift ranges $z \lesssim 0.4$, while
only even more massive clusters are included in the sample
at the higher redshifts. This cluster sample has a
narrower redshift coverage than the simple mass threshold;
all the curves peak at a redshift $z \approx 0.25$. The peak
redshift is mainly attributed to the redshift dependence of
the lensing efficiency for source galaxies at $z_s \sim 1$ in our
redshift distribution. A change of $w_0$ from $w_0 = -1.0$ to
$w_0 = -0.9$ leads to a decrease in the number density, as
seen in Fig. 1. As before, the effect arises partly
from the decrease in comoving volume and the change in
the halo mass function. Unlike the simple mass threshold
case, there is now an additional contribution to the
decrease in number density caused by the lower lensing
efficiency and thus lower $S/N$ for a cluster of a given
mass and redshift.

FIG. 2: As in the previous plot, but shown here is the number
density for the lensing-based cluster sample, where clusters
having a lensing signal greater than a given detection threshold
are selected in the sample as described around Eq. (4).
The dashed, solid and dotted curves show the results for the
detection thresholds $(S/N)_{\rm cluster} > 6$, 8 and 10, respectively.
For comparison, the two dot-dashed curves show the number
density for the mass-selected cluster sample with masses
$M/M_\odot > 5, 10 \times 10^{14}$. Increasing $w_0$ from $w_0 = -1.0$ to
$w_0 = -0.9$ leads to a decrease in the number density as shown
by the thin-solid curve, compared to the bold-solid curve. The
lensing selected number densities peak at a redshift $z \sim 0.25$,
reflecting the redshift dependence of the lensing efficiency
function for source galaxies at $z_s \sim 1$.



C. Lensing power spectrum with tomography 

Gravitational shear can be simply related to the lensing
convergence: the weighted mass distribution integrated
along the line of sight. Photometric redshift information
on source galaxies allows us to subdivide galaxies into
redshift bins (we will discuss possible effects of photometric
redshift errors on our results in § IV E). This allows
more cosmological information to be extracted, which is
referred to as lensing tomography (e.g., see [?, ?, 68]
for a thorough review, and see [33, ?, 39] for the details
of lensing tomography).

In the context of cosmological gravitational lensing the
convergence field with tomographic information is
expressed as a weighted projection of the three-dimensional
mass density fluctuation field:

$$\kappa_{(i)}(\boldsymbol\theta) = \int_0^{\chi_H} d\chi\, W_{(i)g}(\chi)\,\delta[\chi, \chi\boldsymbol\theta], \qquad (9)$$

where $\boldsymbol\theta$ is the angular position on the sky, and $W_{(i)g}$ is
the gravitational lensing weight function for source galaxies
sitting in the $i$-th redshift bin (see Eq. (10) in [?] for
the definition). Note that, hereafter, quantities with
subscripts in the round bracket such as $(i)$ stand for those
for the $i$-th redshift bin. To avoid confusion, throughout
this paper we use $i$, $j$ or $i'$, $j'$ for the lensing power
spectrum redshift bins, and $b$, $b'$ for the cluster count redshift
bins.

The lensing tomographic information allows us to
extract the redshift evolution of the lensing weight function as
well as the growth rate of mass clustering. These are
both sensitive to dark energy. For example, increasing
the equation of state parameter $w$ from $w = -1$ lowers
$W_{(i)g}$ as well as suppressing the growth rate at lower
redshifts. Therefore, when the CMB normalization of density
perturbations is employed, an increase in $w$ decreases the
lensing power spectrum due to both the lower $W_{(i)g}$ and
the lower matter power spectrum amplitude. The
sensitivity of lensing observables to the dark energy equation
of state arises roughly equally from the two effects (e.g.,
see [?]).

The cosmic shear fields are measurable only in a
statistical way. The most conventional method used in the
literature is the shear two-point correlation function,
whose Fourier-transformed counterpart is the shear power
spectrum. The convergence power spectrum is identical
to the shear power spectrum but is easier to work with as
it is a scalar. Using the flat-sky approximation [70], the
angular power spectrum between the convergence fields
of redshift bins $i$ and $j$ is found to be

$$P_{(ij)\kappa}(l) = \int_0^{\chi_H} d\chi\, W_{(i)g}(\chi)\,W_{(j)g}(\chi)\,\chi^{-2}\,P_\delta\!\left(k = \frac{l}{\chi}; \chi\right), \qquad (10)$$

where $P_\delta(k)$ is the three-dimensional mass power
spectrum. We can safely employ the flat-sky approximation
for our purpose, because the most accurate measurement
of the lensing power spectrum is available around
multipoles $l \sim 1000$ for a ground-based survey of our interest
(e.g. see Fig. 1 in [71]), and the flat-sky approximation
serves as a very good approximation on these small scales.

For $l \gtrsim 100$ the major contribution to $P_{(ij)\kappa}(l)$ comes
from non-linear clustering (e.g., see Fig. 2 in [33]). We
employ the fitting formula for the non-linear $P_\delta(k)$
proposed in Smith et al. [?], assuming that it can be applied
to dark energy cosmologies by replacing the growth rate
used in the formula with that for a given dark energy
model. We note in passing that the issue of accurate
power spectra for general dark energy cosmologies still
needs to be addressed carefully (see [?, 75] for related
discussion). Fig. 3 demonstrates how lensing of background
galaxies by clusters contributes to the lensing
power spectrum. Note here that we have employed the
halo model developed in Takada & Jain [?, 51] to
compute the mass power spectrum, although we will use the
Smith et al. fitting formula to compute the lensing power
spectrum in most parts of this paper instead. Briefly, to
compute the spectra based on the halo model approach,
we need to model three ingredients: (i) the halo mass
function (see also the description below Eq. (5)); (ii) the
profile for the mass distribution around a halo; and (iii)
the halo bias parameter.

FIG. 3: A lensing power spectrum for the non-tomographic
case (i.e. one redshift bin) for a $\Lambda$CDM model, expected for
a ground-based survey that probes galaxies with mean
redshift $\langle z_s \rangle = 0.9$. The two thin dashed curves show the 1- and
2-halo term contributions to the power spectrum, while the
bold curve shows the total power. The three thin solid curves
show the 1-halo term contributions obtained when the
lensing effects on background galaxies due to halos with masses
$M/M_\odot > 10^{13}$, $10^{14}$, $10^{15}$ are included, respectively.

It is clear that the convergence on scales $l \gtrsim 100$ is
significantly boosted by the existence of non-linear
structures, halos. In this paper we are especially interested in
using the lensing information inherent in angular scales
$l \lesssim 3000$ $^1$ to constrain dark energy, and a fair fraction
of the power at scales $l \sim 10^3$, up to $\sim 60\%$ of the total
power, arises from massive halos with $M \gtrsim 10^{14} M_\odot$. The
1-halo term contribution is given by the redshift-space
integral of the halo mass function and halo profiles weighted
with the lensing efficiency. The results imply that, if
massive clusters with $M \gtrsim 10^{14} M_\odot$ happen to be less or
more populated in a survey region, amplitudes of the
observed lensing power spectrum from the survey are very
likely to be smaller or greater than expected, respectively.
Therefore, a cross-correlation between the lensing power
spectrum and the cluster counts is intuitively expected,
if both of the observables are measured from the same
survey region.

In reality, the observed power spectrum is contami- 
nated by the intrinsic ellipticity noise. Assuming that 
the intrinsic ellipticity distribution is uncorrelated be- 
tween different galaxies, the observed power spectrum 
between redshift bins i and j can be expressed as 



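$$P^{\rm obs}_{(ij)\kappa}(l) = P_{(ij)\kappa}(l) + \delta^K_{ij}\,\frac{\sigma_\epsilon^2}{\bar n_{(i)}}, \qquad (11)$$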



where $\sigma_\epsilon$ is the rms of intrinsic ellipticities per
component, and $\bar n_{(i)}$ denotes the average number density of
galaxies in the $i$-th redshift bin. The Kronecker delta
function, $\delta^K_{ij}$, accounts for the fact that the cross-spectra
of different redshift bins ($i \ne j$) are not affected by the
shot noise contamination. We will omit the superscript
'obs' when referring to $P^{\rm obs}(l)$ in the following for
notational simplicity.
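The shot noise contribution described above can be sketched in a few lines. The only subtlety is units: since $l$ is conjugate to an angle in radians, the galaxy density must be expressed per steradian. The function name and the sample numbers ($\sigma_\epsilon = 0.22$, densities of 10 and 20 arcmin$^{-2}$) are illustrative assumptions, not values taken from the text.

```python
import math

# Number of square arcminutes in one steradian.
ARCMIN2_PER_SR = (180.0 * 60.0 / math.pi) ** 2

def observed_spectrum(p_kappa, i, j, sigma_eps, n_gal_arcmin2):
    """Add shape noise to an auto-spectrum; cross-spectra are left unchanged.

    p_kappa       : cosmological convergence spectrum P_(ij)(l) at some l
    sigma_eps     : rms intrinsic ellipticity per component
    n_gal_arcmin2 : list of galaxy densities per redshift bin, in arcmin^-2
    """
    if i != j:
        return p_kappa  # Kronecker delta vanishes for different bins
    n_per_sr = n_gal_arcmin2[i] * ARCMIN2_PER_SR
    return p_kappa + sigma_eps**2 / n_per_sr

# Illustrative numbers only.
densities = [10.0, 20.0]
auto = observed_spectrum(1e-9, 0, 0, 0.22, densities)
cross = observed_spectrum(1e-9, 0, 1, 0.22, densities)
```

Because the noise term is independent of $l$ while the signal falls with $l$, the noise eventually dominates the auto-spectra at high multipoles.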



III. COVARIANCES OF LENSING POWER 
SPECTRUM AND CLUSTER OBSERVABLES 

To estimate a realistic forecast for cosmological
parameter constraints for a given survey we have to
quantify sources of statistical error on the observables of interest,
the cluster number counts and the lensing power
spectrum, and then propagate the errors into the parameter
forecasts. In this section, we will present the covariance
matrices of the observables.



A. Covariances of the cluster number counts 

The cluster observables can be naturally incorporated
in the halo model approach, allowing us to compute the
statistical properties in a straightforward way. In this
paper we focus on the average angular number density of
clusters drawn from a survey, also subdivided into
redshift bins as described in § II B. The covariance between
the average number densities in redshift bins $b$ and $b'$,
given by Eq. (5), is defined as



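$$[\mathbf{C}^{\rm c}]_{bb'} \equiv \left\langle N_{(b)} N_{(b')}\right\rangle - \bar N_{(b)}\,\bar N_{(b')}. \qquad (12)$$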



Based on the halo model, the covariances of the angular
number density are derived in Appendix B 1 (also see
[?] for the original derivation) as



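$$[\mathbf{C}^{\rm c}]_{bb'} = \delta^K_{bb'}\,\frac{\bar N_{(b)}}{\Omega_s} + \delta^K_{bb'}\int_0^{\chi_H} d\chi \left(\frac{d^2V}{d\chi\, d\Omega}\right)^2 \chi^{-2}\left[\int dm\, n(m)\,S_{(b)}(m; \chi)\,b(m)\right]^2 \int \frac{l\, dl}{2\pi}\,\left|\tilde W(l\Theta_s)\right|^2 P^L_\delta\!\left(k = \frac{l}{\chi}; \chi\right), \qquad (13)$$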



where $b(m)$ is the halo bias parameter ([?]; we use
the model derived in [?]), $P^L_\delta(k)$ is the linear mass
power spectrum, and $\tilde W(\mathbf{l})$ is the Fourier transform of
the survey window function; for this we simply employ
$\tilde W(l\Theta_s) = 2J_1(l\Theta_s)/(l\Theta_s)$ ($J_1(x)$ is the first-order Bessel
function), assuming a circular geometry of the survey
region, $\Omega_s = \pi\Theta_s^2$. In the following, the tilde symbol is
used to denote the Fourier components of quantities. To
derive the covariance (13), we have ignored correlations
between the number densities in different redshift
bins, which would be a good approximation for a redshift
bin thicker than the correlation length of the cluster
distribution.

The first and second terms in Eq. (13) arise from the 1-
and 2-halo terms in the halo model calculation; the
former gives the shot noise due to the imperfect sampling
of fluctuations by a finite number of clusters, while the
latter represents the sampling variance arising from
fluctuations of the cluster distribution due to a finite survey
volume. It should be noted that our formulation allows
us to derive the shot noise term without introducing it
ad hoc, as is often done in the literature (e.g., see [21]). The
two terms in Eq. (13) depend on sky coverage in slightly
different ways $^2$, and the relative importance depends on
the survey area; for a larger survey, the sampling variance
could be more important than the shot noise [63].
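The two survey-dependent ingredients named above, the circular window $\tilde W(l\Theta_s) = 2J_1(l\Theta_s)/(l\Theta_s)$ and the shot noise scaling $\bar N/\Omega_s$, can be sketched as below. To stay self-contained the Bessel function is evaluated from its standard integral representation $J_1(x) = \pi^{-1}\int_0^\pi \cos(\tau - x\sin\tau)\,d\tau$; all function names are illustrative assumptions, and the sampling-variance integral itself is not evaluated here.

```python
import math

def bessel_j1(x, n=2000):
    """J_1(x) from (1/pi) * int_0^pi cos(t - x sin t) dt, midpoint rule."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def survey_window(l, theta_s):
    """Fourier transform of a circular top-hat survey of radius theta_s [rad]."""
    x = l * theta_s
    if x < 1e-6:
        return 1.0  # limit of 2 J_1(x) / x as x -> 0
    return 2.0 * bessel_j1(x) / x

def shot_noise_variance(nbar_per_sr, omega_s):
    """1-halo (shot noise) term of the count covariance: Nbar / Omega_s."""
    return nbar_per_sr / omega_s

def survey_radius(omega_s):
    """Radius theta_s of a circular survey with area Omega_s = pi * theta_s^2."""
    return math.sqrt(omega_s / math.pi)
```

The window suppresses modes with $l\Theta_s \gg 1$, so only fluctuations on scales comparable to or larger than the survey contribute to the sampling variance, whereas the shot noise simply falls as $1/\Omega_s$.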



$^1$ At the smaller angular scales $l \gtrsim 3000$, more complex uncertainties in non-linear clustering, such as baryonic effects, arise, which need to be addressed more carefully.

$^2$ If a new integration variable $x = l\Theta_s$ is introduced for the second term of Eq. (13), one can find the sky coverage dependence is

B. Covariances of lensing power spectra

In reality the lensing power spectrum has to be
estimated from the Fourier or spherical harmonic coefficients
of the observed lensing fields constructed for a finite
survey. In this paper we assume the flat-sky approximation
and thus use Fourier wavenumbers $\mathbf{l}$, which are
equivalent to spherical harmonic multipoles $l$ in the limit $l \gg 1$
[?]. Because the survey is finite, an infinite number of
Fourier modes are not available, and rather the discrete
Fourier decomposition has to be constructed in terms
of the fundamental mode that is limited by the survey
size; $l_f = 2\pi/\sqrt{\Omega_s}$, where $\Omega_s$ is the survey area. We
assume a homogeneous survey geometry for simplicity and
do not consider any complex boundary and/or masking
effects. The lensing power spectrum of a multipole $l$ is
observationally estimated by averaging over wavenumber
direction in an annulus of width $\Delta l$

(14) 



where the integration range is confined to the Fourier
modes of $\mathbf{l}'$ satisfying the bin condition $l - \Delta l/2 \le l' \le l + \Delta l/2$
and $A(l)$ denotes the integration area in the Fourier
space, approximately given by $A(l) \equiv \int_{|\mathbf{l}'| \in l} d^2\mathbf{l}' \approx 2\pi l \Delta l$.

This is discussed in more detail in Appendix B 2.
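The mode counting implied by the fundamental mode $l_f$ and the annulus area $A(l)$ can be checked with a short sketch. Dividing $A(l) \approx 2\pi l \Delta l$ by the Fourier-space cell area $l_f^2$ gives $l\,\Delta l\,\Omega_s/(2\pi) = 2 l\,\Delta l\, f_{\rm sky}$, which is the flat-sky counterpart of the familiar factor $(2l+1)\Delta l\, f_{\rm sky}$. The function names and the 5000 deg$^2$ survey area are illustrative assumptions.

```python
import math

def fundamental_mode(omega_s):
    """l_f = 2 pi / sqrt(Omega_s), with the survey area Omega_s in steradians."""
    return 2.0 * math.pi / math.sqrt(omega_s)

def modes_in_annulus(l, delta_l, omega_s):
    """Number of discrete Fourier modes in an annulus: A(l) / l_f^2."""
    area = 2.0 * math.pi * l * delta_l  # A(l) ~ 2 pi l Delta l
    return area / fundamental_mode(omega_s) ** 2

# Illustrative survey: 5000 square degrees.
omega_s = 5000.0 * (math.pi / 180.0) ** 2
f_sky = omega_s / (4.0 * math.pi)
n_flat = modes_in_annulus(1000.0, 100.0, omega_s)
n_full = (2 * 1000 + 1) * 100.0 * f_sky
ratio = n_flat / n_full
```

The two counts differ only by the factor $2l/(2l+1)$, which is negligible at the multipoles $l \gg 1$ where the flat-sky approximation is used.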

Once an estimator of the lensing power spectrum is
defined, it is straightforward to compute the covariance
[46, 77] (also see [?] for the detailed derivation). From
Eq. (B12), the covariance to describe the correlation
between the lensing power spectra of different multipoles
and redshift bins is given by

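$$\begin{aligned}
[\mathbf{C}^{\rm g}]_{mn} &\equiv \left\langle P^{\rm est}_{(ij)\kappa}(l)\,P^{\rm est}_{(i'j')\kappa}(l')\right\rangle - P_{(ij)\kappa}(l)\,P_{(i'j')\kappa}(l') \\
&= \frac{\delta^K_{ll'}}{(2l+1)\Delta l\, f_{\rm sky}}\left[P_{(ii')\kappa}(l)\,P_{(jj')\kappa}(l) + P_{(ij')\kappa}(l)\,P_{(ji')\kappa}(l)\right] \\
&\quad + \frac{1}{4\pi f_{\rm sky}}\int_{|\mathbf{l}| \in l}\frac{d^2\mathbf{l}}{A(l)}\int_{|\mathbf{l}'| \in l'}\frac{d^2\mathbf{l}'}{A(l')}\,T_{(iji'j')\kappa}(\mathbf{l}, -\mathbf{l}, \mathbf{l}', -\mathbf{l}'), \qquad (15)
\end{aligned}$$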

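where $f_{\rm sky}$ is the sky coverage ($f_{\rm sky} = \Omega_s/4\pi$) and the
lensing trispectrum is defined in terms of the 3D mass
trispectrum $T_\delta$ as

$$T_{(iji'j')\kappa}(\mathbf{l}_1, \mathbf{l}_2, \mathbf{l}_3, \mathbf{l}_4) = \int_0^{\chi_H} d\chi\, W_{(i)g}(\chi)\,W_{(j)g}(\chi)\,W_{(i')g}(\chi)\,W_{(j')g}(\chi)\,\chi^{-6}\,T_\delta(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3, \mathbf{k}_4; \chi), \qquad (16)$$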

with $\mathbf{k}_i = \mathbf{l}_i/\chi$. Note that the power spectra $P_{(ij)\kappa}$
appearing on the r.h.s. of Eq. (15) are the observed spectra
given in Eq. (11), and therefore include the intrinsic
ellipticity noise. The indices $m, n$ denote elements
in the lensing power spectrum covariance and run over
the multipole bins and redshift bins. For tomography
with $n_s$ redshift bins, there are $n_s(n_s+1)/2$ different
spectra available at each multipole. Hence, assuming
$n_l$ multipole bins, the indices run as
$m, n = 1, 2, \dots, n_l n_s(n_s+1)/2$. In most parts of this paper we
adopt 100 logarithmically spaced multipole bins, which
are sufficient to capture all the relevant features in the
lensing power spectrum. For example, for tomography
with 3 redshift bins, the covariance matrix has
dimension 600 × 600 for $n_l = 100$.
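
The index counting above can be sketched in a few lines of Python. The function names and the particular ordering of the flat index are illustrative assumptions, not the paper's; only the counts $n_s(n_s+1)/2$ and $n_l n_s(n_s+1)/2$ come from the text.

```python
def n_spectra(n_s: int) -> int:
    """Number of distinct auto- and cross-spectra for n_s redshift bins."""
    return n_s * (n_s + 1) // 2


def covariance_dimension(n_l: int, n_s: int) -> int:
    """Dimension of the lensing covariance for n_l multipole bins."""
    return n_l * n_spectra(n_s)


def flat_index(l_bin: int, i: int, j: int, n_s: int) -> int:
    """Map (multipole bin, redshift pair) to a flat index m.

    Assumed ordering (hypothetical): multipole bin is the slow index,
    and the pairs (i, j) with i <= j are enumerated row by row.
    """
    if i > j:  # spectra are symmetric in the redshift-bin pair
        i, j = j, i
    pair = i * n_s - i * (i - 1) // 2 + (j - i)
    return l_bin * n_spectra(n_s) + pair


if __name__ == "__main__":
    print(n_spectra(3), covariance_dimension(100, 3))
```

For 3 redshift bins and 100 multipole bins this gives 6 spectra per multipole and a 600 × 600 covariance matrix, matching the example in the text.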

The first term of the covariance matrix (second line of
Eq. (15)) represents the Gaussian error contribution,
ensuring via $\delta^K_{ll'}$ that the power spectra of different
multipoles are uncorrelated, while the second term gives the
non-Gaussian errors that describe correlations between
different power spectra. Both terms scale with
sky coverage as $\propto 1/f_{\rm sky}$. Note that the non-Gaussian
term does not depend on the multipole bin width $\Delta l$
because $\int d^2\mathbf{l}/A(l) \approx 1$, so taking a wider bin only
reduces the Gaussian contribution or, equivalently, enhances
the relative importance of the non-Gaussian contribution.
Naturally, however, the signal-to-noise ratio and
parameter forecasts shown below do not depend
on the multipole bin width unless the bin width is very
coarse (see [43] for details).
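
The bin-width independence can be checked numerically in the simplest case. The sketch below assumes only the Gaussian term, a single redshift bin, no shape noise, and arithmetic bin centres, so that the variance per bin is $2P^2/[(2l+1)\Delta l f_{\rm sky}]$ and the power spectrum cancels out of the signal-to-noise ratio. The function name is hypothetical.

```python
import numpy as np


def gaussian_sn_squared(bin_edges, f_sky):
    """Cumulative (S/N)^2 from the Gaussian term alone.

    Assumes Cov_G = 2 P^2 / [(2l+1) dl f_sky] in each bin, so that
    P^2 / Cov_G = (2l+1) dl f_sky / 2 regardless of the value of P.
    """
    edges = np.asarray(bin_edges, dtype=float)
    l_mid = 0.5 * (edges[:-1] + edges[1:])  # arithmetic bin centre
    delta_l = np.diff(edges)
    return float(np.sum((2.0 * l_mid + 1.0) * delta_l * f_sky / 2.0))


F_SKY = 5000.0 / 41252.96  # 5000 deg^2 as a fraction of the full sky
coarse = gaussian_sn_squared(np.geomspace(50.0, 3000.0, 11), F_SKY)
fine = gaussian_sn_squared(np.geomspace(50.0, 3000.0, 101), F_SKY)
```

With these assumptions the sum telescopes, so 10 and 100 logarithmic bins over $50 \le l \le 3000$ give the same total. Once shape noise or the non-Gaussian term is included the equality is only approximate.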

We employ a further simplification to make quick
computations of the lensing covariance matrices. We use the
halo model approach to compute the lensing covariance
matrices. We know that most of the signal in the power
spectrum comes from small angular scales at $l \sim 10^3$, to
which the 1-halo term provides the dominant contribution, as
shown in Fig. 3. In addition, the non-Gaussian errors are
important only at small angular scales. For these reasons,
we only include the 1-halo term contribution to the
lensing trispectrum to compute the non-Gaussian errors.
Although the trispectrum generally depends on four vectors
in Fourier space, $\mathbf{l}_1, \mathbf{l}_2, \mathbf{l}_3$ and $\mathbf{l}_4$, the 1-halo
term does not depend on any angle between the vectors,
but rather depends only on the length of each vector:
$T^{1h}(\mathbf{l}_1,\mathbf{l}_2,\mathbf{l}_3,\mathbf{l}_4) = T^{1h}(l_1,l_2,l_3,l_4)$ (see SHI Hi),
reflecting the spherical mass distribution around a halo in a
statistical average sense, which does not have any
preferred direction in Fourier space. Therefore, the
non-Gaussian term in Eq. (15) can be further simplified as

$$
\frac{1}{4\pi f_{\rm sky}}\int_{|\mathbf{l}|\in l}\frac{d^2\mathbf{l}}{A(l)}\int_{|\mathbf{l}'|\in l'}\frac{d^2\mathbf{l}'}{A(l')}\,T_{(iji'j')\kappa}(\mathbf{l},-\mathbf{l},\mathbf{l}',-\mathbf{l}') \approx \frac{1}{4\pi f_{\rm sky}}\,T^{1h}_{(iji'j')\kappa}(l,l,l',l'),
\tag{17}
$$



expressed as $\propto (1/f_{\rm sky})\int dx\, x\, P(k = x/\Theta_s\chi)\,|\tilde W(x)|^2$, which
looks similar to the $f_{\rm sky}$ dependence of the first term, given as
$\propto 1/f_{\rm sky}$. However, the $f_{\rm sky}$ dependence could be different via
the dependence on $P(k = x/\Theta_s\chi)$ for the second term.



where we have assumed that the lensing trispectrum does 
not change significantly within the multipole bin, which 
is a good approximation for the lensing fields. 

C. Cross-covariances of the cluster number counts
and lensing power spectra 

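The cluster observables and the weak lensing power
spectra probe the same density fluctuation fields in
large-scale structure, if the two observables are drawn from
the same survey region. As implied in Fig. 3, a
somewhat significant correlation between the two observables
is expected if the small-scale lensing power spectrum is
considered. We again use the halo model to compute the
cross-covariance. The detailed derivation is described in
Appendix B 3, and the cross-covariances can be expressed
as

$$
[\mathbf{C}^{gc}]_{mb'} \equiv \left\langle P^{\rm est}_{(ij)\kappa}(l)\,N_{(b')}\right\rangle - P_{(ij)\kappa}(l)\,N_{(b')} = \frac{1}{\Omega_s}\int_0^{\chi_H} d\chi\,\frac{d^2V}{d\chi\, d\Omega}\,W_{(i)}(\chi)W_{(j)}(\chi)\,\chi^{-4}\,B_{(b')c\delta\delta}(k = l/\chi;\chi).
\tag{18}
$$

Here $B_{(b')c\delta\delta}$ is the 3D bispectrum corresponding to the
three-point function of the cluster distribution and the
two mass density fluctuation fields. The cross-covariance
arises from two contributions to the 3D bispectrum, the
1- and 2-halo terms:

$$
\begin{aligned}
B^{1h}_{(b)c\delta\delta}(k;\chi) &= \int dm\, n(m)\,S_{(b)}(m)\left(\frac{m}{\bar\rho_m}\right)^2 |\tilde u_m(k)|^2, \\
B^{2h}_{(b)c\delta\delta}(k;\chi) &= 2\left[\int dm_1\, n(m_1)\,b(m_1)\frac{m_1}{\bar\rho_m}\tilde u_{m_1}(k)\right] \\
&\quad \times \left[\int dm_2\, n(m_2)\,S_{(b)}(m_2)\,b(m_2)\frac{m_2}{\bar\rho_m}\tilde u_{m_2}(k)\right]P^L_\delta(k;\chi),
\end{aligned}
\tag{19}
$$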



where $\tilde u_m$ is the Fourier transform of the halo profile, for
which we assume an NFW profile [s^ as explicitly
defined in Eq. (A14). The cross-covariance arising from
the 1-halo term represents the correlation between one
cluster, treated as a point, and the lensing effects on two
different background galaxies due to the same cluster. The
2-halo term contribution describes the correlation between
one cluster, the lensing field on background galaxies
around the cluster, and the lensing field due to another
cluster. Note that the cross-covariance is derived
assuming the flat-sky approximation, as we focus mainly on
small angular scale information, but the full-sky
expression can be derived by combining the methods developed in
this paper and in [t^I.

Fig. 4 shows the cross-covariance between the
mass-selected cluster counts and the weak lensing power
spectrum as a function of angular multipole $l$, for a
concordance ΛCDM model. For illustrative clarity we use a
single redshift bin for both the cluster counts and the
lensing power spectrum. The dashed, solid and dotted curves
are the results obtained when minimum halo masses of
$M_{\rm min}/10^{14}M_\odot = 1, 5$ and 10 are assumed for the
cluster counts, respectively. The two thin, solid curves show
the 1- and 2-halo term contributions to the total power

[Figure 4: plot labelled $\Omega_s = 5000$ deg$^2$; horizontal axis: multipole $l$.]

FIG. 4: The cross-covariance between the cluster counts $N$
and the lensing power spectrum $P_\kappa(l)$ as a function of the
angular multipole $l$ for a ΛCDM model. Note that for the
purpose of this figure we assume a single redshift bin for the
lensing power spectrum for a typical ground-based survey (see
§ IV A) and, for the cluster counts, we include all the
clusters with masses above a given minimum halo mass over a
range of redshifts $0 < z < 1$. The dashed, solid and dotted
curves demonstrate the results when minimum halo masses
of $M_{\rm min} = 1, 5$ and $10 \times 10^{14} M_\odot$ are employed, respectively.
The two thin, solid curves show the 1- and 2-halo term
contributions to the cross-covariance for $M_{\rm min} = 5 \times 10^{14} M_\odot$ (see
Eqs. (18) and (19)), while the thin dashed and dotted curves
show the 1-halo term contribution.



(bold solid curve) for the $M_{\rm min}/10^{14}M_\odot = 5$ mass cut.
It is apparent that the cross-covariance at small angular
scales $l \gtrsim 500$ arises mainly from the 1-halo term
contribution. Comparing the dashed, solid and dotted curves
clarifies that the covariance amplitude increases with
decreasing minimum halo mass, as the weak lensing and
the cluster counts probe more similar density fields in the
large-scale structure, as implied in Fig. 3.

A more useful quantity is the cross-correlation
coefficient, defined as

$$
r(l) \equiv \frac{\mathrm{Cov}[N_{(1)}, P_{(11)\kappa}(l)]}{\sqrt{\mathrm{Cov}[N_{(1)}, N_{(1)}]\;\mathrm{Cov}[P_{(11)\kappa}(l), P_{(11)\kappa}(l)]}},
\tag{20}
$$



where the subscript '1' denotes the first redshift bin,
because for this calculation we are putting all the clusters
into a single redshift bin for illustration (the cluster
redshift bin index $b = 1$ in this case). The coefficients
quantify the relative importance of the cross-covariance
to the auto-covariances at a given $l$. The upper panel of
Fig. 5 shows the correlation coefficients for the model
parameters assumed in Fig. 4. The coefficients depend on the
multipole bin width taken in the lensing power spectrum
covariance calculation as well as on the survey sky
coverage; we here assumed $\Delta l/l \simeq 0.04$ and $\Omega_s = 5000$ deg$^2$,
except for the thin solid curve where a full-sky survey

[Figure 5: horizontal axis of the lower panel: minimum halo mass.]

FIG. 5: Upper panel: The cross-correlation coefficients,
defined by Eq. (20), as a function of multipole $l$. The coefficient
depends on the multipole bin width and survey area; we
assumed $\Delta l/l \simeq 0.04$ and $\Omega_s = 5000$ deg$^2$ ($f_{\rm sky} \simeq 0.12$), except
for the thin solid curve where we assumed a full-sky survey,
$f_{\rm sky} = 1$. Lower panel: A similar plot, but as a function of
mass thresholds in the cluster counts, for a fixed multipole
of the lensing power spectrum. The bold solid, dashed and
dotted curves are the results for $l = 3000$, 1000 and 500,
respectively. The thin solid curve shows the result for $l = 3000$
if the intrinsic ellipticity noise is ignored.



$f_{\rm sky} = 1$ is considered.
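
As a minimal illustration of how such a coefficient is formed from covariance elements, the Python sketch below normalizes a cross-covariance by the two auto-variances. The function name and all numbers are made up for illustration; they are not values from this paper.

```python
import numpy as np


def cross_correlation_coefficient(cov_cross, var_counts, var_power):
    """r(l) = Cov[N, P(l)] / sqrt(Cov[N, N] * Cov[P(l), P(l)])."""
    cov_cross = np.asarray(cov_cross, dtype=float)
    var_power = np.asarray(var_power, dtype=float)
    return cov_cross / np.sqrt(var_counts * var_power)


# Illustrative (made-up) covariance elements at three multipoles.
cov_cross = np.array([0.2, 0.6, 0.3])   # Cov[N, P(l)]
var_power = np.array([1.0, 4.0, 9.0])   # Cov[P(l), P(l)]
var_counts = 0.25                       # Cov[N, N]
r = cross_correlation_coefficient(cov_cross, var_counts, var_power)
```

If all three covariance elements scale with sky coverage in the same way, the common factor cancels in the ratio, which is why a weak dependence of the coefficient on survey area is expected in that limit.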

The upper panel of Fig. 5 shows that the coefficients
peak around $l \sim 1000$, and decrease at smaller scales.
On the intermediate scales there is a significant
cross-correlation since the 1-halo term in the lensing power
spectrum depends so strongly on the number of clusters
(Fig. 3). However, on smaller scales the lensing covariance
is dominated by shot noise from the intrinsic galaxy shapes
(e.g., see Fig. 1 in [7l[), which does not correlate with the
cluster counts. Comparing the thin and bold solid curves
shows that the coefficients have only a weak dependence on
the sky coverage, reflecting that the sampling variance of
the cluster count covariance roughly scales as $f_{\rm sky}^{-1}$
for a large-area survey of the kind considered here, which is
the same dependence as for the other elements in the covariances.

The lower panel of Fig. 5 shows the correlation
coefficients with varying mass thresholds in the cluster
counts, for a fixed multipole $l$ of the lensing power
spectrum. The lensing power spectrum at $l \sim 1000$ is found
to be most correlated with the cluster counts for the
$M_{\rm min} \simeq 10^{15} M_\odot$ mass cut. The correlation decreases
at high mass thresholds, where the number of clusters is
very small and therefore not representative of the lensing
field. The correlation also decreases at smaller masses,
since the contribution to the lensing power spectrum is
small for light halos (Fig. 3). As can be seen from a
comparison of the bold and thin solid curves, including
the intrinsic ellipticity noise suppresses the correlation
coefficients.



[Figure 6: curve label $l = 1000$; horizontal axis: redshift $z$.]

FIG. 6: The relative contribution of clusters at each redshift to
the cross-covariance at a given multipole. We assumed the
same model parameters as in Fig. 4. For an angular scale of
$l = 1000$, clusters at $z \sim 0.2$ contribute most to the
cross-covariance, reflecting the peak of the lensing efficiency
function. Note that we are here using the simple mass threshold
for the cluster selection, not the lensing-based selection. On
the other hand, for a larger angular scale of $l = 100$, most of
the contribution comes from clusters at lower redshifts.

We now consider multiple redshift bins in the cluster
catalog. Fig. 6 shows the relative contribution of each
cluster count redshift bin to the cross-covariance at a
given multipole. It is clear that clusters at $z \sim 0.2$ to $0.3$
contribute most to the cross-covariance for an angular
scale of $l \sim 1000$. Since the number density of
mass-selected cluster counts has a weak redshift dependence
as shown in Fig. [H one can notice that the redshift peak 
reflects the redshift dependence of the lensing efficiency
function for a source redshift $z_s \sim 1$ that corresponds to the
survey depth we assumed; structures at $z \sim 0.2$ to $0.3$ most
efficiently cause the lensing effect on source galaxies at
$z_s \sim 1$. It is also worth noting that clusters at higher
redshifts have a smaller angular size (smaller virial radii)
than $\theta \sim 1/l$ (e.g. see the right panel of Fig. 2 in [5§|). In
other words, clusters at $z \gtrsim 0.5$ may carry
complementary information to the lensing power spectrum. On the
other hand, for an angular scale of $l \sim 100$, clusters at
lower redshifts $z \lesssim 0.1$ contribute most to the covariance,
because the cluster virial radius matches such a large
angular scale only if the cluster is located at lower redshift.



IV. RESULTS: SIGNAL-TO-NOISE AND 
PARAMETER FORECASTS 

A. ΛCDM model and survey parameters

To compute the observables of interest we need to
specify a cosmological model, and we assume survey parameters
similar to those of future surveys in order to estimate
realistic measurement errors.

We include the key parameters that may affect the
observables within an adiabatic CDM-dominated model
with a dark energy component: the density parameters are
$\Omega_{\rm de}(= 0.73)$, $\Omega_m h^2(= 0.14)$, and $\Omega_b h^2(= 0.024)$ (note
that we assume a flat universe); the primordial power
spectrum parameters are the spectral tilt, $n_s(= 1)$, the
running index, $\alpha_s(= 0)$, and the normalization parameter
of the primordial curvature perturbation, $\delta_\zeta(= 5.07 \times 10^{-5})$
(the values in parentheses denote the fiducial model).
We employ the transfer function of matter perturbations,
$T(k)$, with baryon oscillations smoothed out [t^. We
employ the dark energy model [tqI [soj parametrized as
$w(a) = w_0 + w_a(1 - a)$, with fiducial values $w_0 = -1$ and
$w_a = 0$.

We specify survey parameters that closely resemble a
future ground-based survey (e.g., see §]). We model the
redshift distribution of galaxies by using a toy model
given by Eq. (4) in [8l|; we employ the parameter value
$z_0 \simeq 0.3$, leading the redshift distribution to peak at
$z_{\rm peak} = 2z_0 = 0.6$ and to have a mean redshift of
$z_m = 3z_0 = 0.9$. The intrinsic ellipticities dilute the lensing
shear measurements according to Eq. (11); we simply
assume that the shot noise contamination is modeled by
the rms ellipticity per component, $\sigma_\epsilon = 0.22$, and the
total number density of galaxies, $\bar n_g = 30$ arcmin$^{-2}$. The
survey area is taken to be $\Omega_s = 5000$ deg$^2$ for our
fiducial choice. Note that throughout this paper we will
assume that the two observables we are interested in, the
cluster number counts and the lensing power spectrum,
are taken from the same survey region, to study how the
cross-correlation affects the parameter constraints, as the
two methods probe the same cosmic structure.
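
The survey numbers quoted above can be cross-checked with a short Python sketch. Two things here are assumptions rather than statements from the text: the redshift distribution is taken to be $n(z) \propto z^2 \exp(-z/z_0)$, a form that reproduces the quoted peak at $2z_0$ and mean at $3z_0$, and the shape-noise power is taken to have the usual form $\sigma_\epsilon^2/\bar n_g$ with $\bar n_g$ converted to sr$^{-1}$. All function names are hypothetical.

```python
import math

import numpy as np

DEG2_PER_SR = (180.0 / math.pi) ** 2
ARCMIN2_PER_SR = (180.0 * 60.0 / math.pi) ** 2


def sky_fraction(area_deg2: float) -> float:
    """f_sky = Omega_s / (4 pi), with Omega_s given in square degrees."""
    return area_deg2 / (4.0 * math.pi * DEG2_PER_SR)


def shape_noise_power(sigma_eps: float, n_g_arcmin2: float) -> float:
    """Assumed white-noise level sigma_eps^2 / n_g, with n_g per steradian."""
    return sigma_eps**2 / (n_g_arcmin2 * ARCMIN2_PER_SR)


def redshift_peak_and_mean(z0: float, z_max: float = 20.0, n: int = 200001):
    """Peak and mean of the assumed n(z) ~ z^2 exp(-z/z0) on a uniform grid."""
    z = np.linspace(0.0, z_max, n)
    nz = z**2 * np.exp(-z / z0)
    z_peak = float(z[np.argmax(nz)])
    z_mean = float(np.sum(z * nz) / np.sum(nz))  # uniform grid: dz cancels
    return z_peak, z_mean


f_sky = sky_fraction(5000.0)
noise = shape_noise_power(0.22, 30.0)
z_peak, z_mean = redshift_peak_and_mean(0.3)
```

Under these assumptions the sketch returns a sky fraction close to 0.12, a peak redshift close to 0.6 and a mean redshift close to 0.9, consistent with the values quoted in the text.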



B. A signal-to-noise ratio

FIG. 7: Upper panel: Signal-to-noise (S/N) ratio as a
function of the mass threshold of the cluster count
measurement. The solid curve shows the total S/N for a combined
measurement of the cluster counts and lensing power
spectrum where the cross-correlation between the two observables,
$\mathrm{Cov}[N_{\rm cl}, P_\kappa(l)]$, is included. For comparison, the dashed curve
shows the S/N assuming that the two observables are
independent (i.e. the cross-correlation is ignored). The
dotted and dot-dashed curves show the S/N when either the
cluster counts or the lensing power spectrum alone is
considered. Lower panel: The percentage difference in S/N with
and without the cross-covariance $\mathrm{Cov}[N_{\rm cl}, P_\kappa(l)]$ (i.e., the
difference between the solid and dashed curves divided by the
dashed curve in the upper panel, ×100). Interestingly, the
cross-correlation improves the S/N by up to 10% when only
massive clusters with $M \gtrsim 10^{15} M_\odot$ are included in the counts
(see text for a more detailed discussion). All curves assume a
survey with an area of $\Omega_s = 5000$ deg$^2$. Note that we
considered a single redshift bin for both of the two observables, and
included the number counts of clusters with masses greater
than the mass threshold over redshifts $0.05 < z < 1.0$ and
the lensing power spectrum at multipoles over $50 \le l \le 3000$,
respectively (see text for the details).

It is instructive to investigate the expected
signal-to-noise (S/N) ratio for a combined measurement of the
cluster counts and the lensing power spectrum, in
order to highlight how the cross-correlation between the
two observables affects the measurement accuracies. The
S/N can be estimated using the covariance matrix
derived in § III as

$$
\left(\frac{S}{N}\right)^2_{g+c} = \sum_{i,j} D_i\,[\mathbf{C}^{g+c}]^{-1}_{ij}\,D_j.
\tag{21}
$$

FIG. 8: The relative importance of each ingredient
in the covariance calculation to the percentage difference in
S/N. The solid curve shows the result for our fiducial model,
as shown in the lower panel of Fig. 7. The dashed curve
shows the results where only the 1-halo terms are included
in each element of the full covariance matrix, while the thin
dashed curve shows the results when only the 2-halo term
contribution to the cross-covariance is included. The dotted
and dot-dashed curves show the results when the intrinsic
ellipticity noise or the non-Gaussian errors are ignored in the
weak lensing power spectrum covariance, respectively.



Here the data vector of our observables, $\mathbf{D}$, constructed
from the lensing power spectrum tomography with $n_s$
redshift bins and the cluster number counts with $b$
redshift bins, is defined as

$$
\mathbf{D} = \left\{P_{(11)\kappa}(l_1), \dots, P_{(n_s n_s)\kappa}(l_{\rm max}), N_{(1)}, \dots, N_{(b)}\right\}.
\tag{22}
$$

Note that the dimension of $\mathbf{D}$ is $b + n_l \times n_s(n_s+1)/2$ when
lensing tomography with $n_l$ multipole bins and $n_s$
redshift bins and cluster counts with $b$ redshift bins
are considered (also see below Eq. (16)). For the case of
$b = 10$, $n_s = 3$ and $n_l = 100$, the dimension of $\mathbf{D}$ is 610.
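The full covariance matrix for the joint measurement,
$\mathbf{C}^{g+c}$, can be constructed from Eqs. (13), (15), and (18)
as

$$
\mathbf{C}^{g+c} = \begin{pmatrix} \mathbf{C}^{g} & \mathbf{C}^{gc} \\ (\mathbf{C}^{gc})^{T} & \mathbf{C}^{c} \end{pmatrix}.
\tag{23}
$$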



Note that $[\mathbf{C}^{g+c}]^{-1}$ appearing in Eq. (21) is the inverse
matrix of $\mathbf{C}^{g+c}$.

For comparison we consider the S/N from the
cluster counts and the weak lensing alone, by using the relevant
part of the data vector in Eq. (22) and of the covariance
matrices. We also compare with the S/N obtained if the
cross-correlation is not taken into account, i.e. when a matrix of
zeros is used instead of the matrix $\mathbf{C}^{gc}$ in Eq. (23). In
this case the two observables are independent, as if measured
from two non-overlapping survey regions, and the S/N
values from the cluster counts and the lensing power
spectrum alone therefore add in quadrature to form the
joint S/N.

When computing the S/N in Eq. (21), care must be taken
with the numerical accuracy of the matrix inversion. The
observables of interest, the angular number density of
clusters and the lensing power spectrum, have different
units and their amplitudes could therefore differ from
each other by many orders of magnitude. To avoid
numerical inaccuracies caused by this fact, we have used the
dimensionless covariance matrix $\tilde{\mathbf{C}}^{g+c}$ normalized by the
data vector as

$$
[\tilde{\mathbf{C}}^{g+c}]_{ij} \equiv \frac{[\mathbf{C}^{g+c}]_{ij}}{D_i D_j}.
\tag{24}
$$

In terms of the re-defined covariance matrix, the
total S/N can be computed simply as
$(S/N)^2_{g+c} = \sum_{i,j}[(\tilde{\mathbf{C}}^{g+c})^{-1}]_{ij}$.
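
The procedure described above can be sketched in Python as follows. The function names and all numerical values are illustrative assumptions (a two-element lensing vector and a single cluster bin, with amplitudes differing by many orders of magnitude); they are not the matrices used in this paper.

```python
import numpy as np


def joint_covariance(cov_g, cov_c, cov_gc=None):
    """Assemble the block covariance; a zero cross-block means independence."""
    cov_g = np.atleast_2d(np.asarray(cov_g, dtype=float))
    cov_c = np.atleast_2d(np.asarray(cov_c, dtype=float))
    if cov_gc is None:
        cov_gc = np.zeros((cov_g.shape[0], cov_c.shape[0]))
    cov_gc = np.asarray(cov_gc, dtype=float)
    return np.block([[cov_g, cov_gc], [cov_gc.T, cov_c]])


def signal_to_noise(data, cov):
    """S/N from the dimensionless covariance C_ij / (D_i D_j).

    After this rescaling, D^T C^{-1} D equals the sum of all elements
    of the inverse of the rescaled matrix.
    """
    data = np.asarray(data, dtype=float)
    cov_tilde = np.asarray(cov, dtype=float) / np.outer(data, data)
    return float(np.sqrt(np.sum(np.linalg.inv(cov_tilde))))


# Made-up numbers: lensing band powers ~1e-9, cluster counts ~1e4.
d_g = np.array([2e-9, 1e-9])
cov_g = np.array([[4e-20, 1e-20], [1e-20, 1e-20]])
d_c = np.array([1e4])
cov_c = np.array([[4e4]])
cov_gc = np.array([[1e-9], [5e-10]])

data = np.concatenate([d_g, d_c])
sn_g = signal_to_noise(d_g, cov_g)
sn_c = signal_to_noise(d_c, cov_c)
sn_indep = signal_to_noise(data, joint_covariance(cov_g, cov_c))
sn_full = signal_to_noise(data, joint_covariance(cov_g, cov_c, cov_gc))
```

With the cross-block set to zero, the joint value reduces to the quadrature sum of the two individual values, which is a convenient sanity check on any implementation.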


Fig. 7 shows the S/N ratios expected for a
ground-based survey with area $\Omega_s = 5000$ deg$^2$ and our
fiducial ΛCDM model, as a function of the minimum halo mass
used in the mass-selected cluster counts. Here we
include all the clusters with masses greater than a given
mass threshold over a range of $0.05 < z < 1$, and include
the lensing power spectrum at multipoles $50 \le l \le 3000$
assuming the redshift distribution of galaxies described
in § IV A. Note that we here consider a single redshift bin
for both the cluster counts and the cosmic shear power
spectrum for simplicity, and the signal-to-noise ratios only
slightly increase by adding redshift binned information
(e.g., see Fig. 5 in [39]). First of all, we should notice
that the lensing power spectrum and the cluster number
counts have similar S/N ratios, when the mass threshold
Afmin ~ 1O"'^'*M0 is used. At mass thresholds smaller 
than 10"'^'*Mq, the cluster counts (dotted curve) have 
a greater S/N than the lensing power spectrum
(dot-dashed curve) due to an increase in the number of
sampled clusters, while the lensing power spectrum has a
greater S/N at greater mass thresholds.

The solid curve shows the total S/N for a combined 
measurement of the cluster counts and the lensing power 
spectrum, when the cross-correlation between the two 
observables is correctly taken into account for the full
covariance matrix (see Eq. (23)). We compare this to the
standard approach in which the two probes are consid- 
ered to be independent (dashed curve). The lower panel 
explicitly shows the percentage difference in S/N with 
and without the cross-covariance. 

At small mass thresholds < lO^-^M©, the total S/N 
taking into account the cross-covariance is degraded com- 
pared to when the probes are considered independent. 
This is because the cosmic density field probed is shared 
by the two observables and therefore an inclusion of the 
cross-covariance reduces independence of the two observ- 
ables. However, the degradation ceases at a critical mass 
scale where the total S/N (including the covariance) is 
equal to the S/N for the lensing power spectrum alone. 
In other words, the total S/N is never smaller than the
S/N obtained from either the lensing power spectrum
or the cluster counts alone.

Then, an intriguing result is found: the total S/N is
slightly improved by including the cross-covariance as the
mass threshold is increased up to $M \gtrsim 10^{15} M_\odot$, where
the improvement is up to ~10% as shown in the lower
panel. This occurs even though the S/N ratio from the
cluster counts alone is much less than that for the lensing
power spectrum alone. The peak mass scale of the total
S/N corresponds to the mass scale at which the
correlation coefficient of the covariance peaks, as can be found
in the lower panel of Fig. 5. That is, the improvement in
S/N could happen when the two observables are highly
correlated. Since the cross-covariance describes how the
two observables are correlated with each other, it appears
that a knowledge of the number of such massive clusters
with $M \gtrsim 10^{15} M_\odot$ for a given survey region helps to
improve the amount of information that can be extracted
from the weak lensing measurement (also see [s^ . Issj for
the related discussion). In simpler words, if a smaller or
greater number of massive clusters than the ensemble
average value was observed in a given survey region, the
observed lensing power spectrum will most likely have
smaller or greater amplitudes at $l \sim 10^3$, respectively.

We reproduce this qualitative behavior using a simple
toy model described in Appendix C, where the
lensing power spectrum is modeled to be given solely by the
number of halos, ignoring the halo mass profile and the
clustering of different halos. Based on this toy model we
attribute the increase in the total S/N for high cluster
mass thresholds to the fact that the lensing power
spectrum amplitude is sensitive to the number of such
massive clusters, as demonstrated in Fig. 3. The fact that the
lensing power spectrum is sensitive to the number counts
weighted by the mass squared means that it adds
complementary information to the unweighted sum from the
cluster counts.
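
A stand-in for this kind of toy model can be written down in a few lines of Python. This is not the Appendix's actual model: it assumes independent Poisson counts in discrete mass bins, a "power spectrum" equal to the mass-squared weighted sum of the counts, and an invented mass function, with masses in units of $10^{14} M_\odot$. All names and numbers are hypothetical.

```python
import numpy as np


def toy_signal_to_noise(masses, mean_counts, m_min):
    """S/N values for P = sum_i m_i^2 N_i and N = sum_{m_i >= m_min} N_i.

    N_i are assumed independent Poisson variables, so Var[N_i] = mean.
    """
    masses = np.asarray(masses, dtype=float)
    mean_counts = np.asarray(mean_counts, dtype=float)
    weight = masses**2
    selected = masses >= m_min

    power = np.sum(weight * mean_counts)
    counts = np.sum(mean_counts[selected])
    var_p = np.sum(weight**2 * mean_counts)
    var_n = counts
    cov_pn = np.sum(weight[selected] * mean_counts[selected])

    data = np.array([power, counts])
    cov_full = np.array([[var_p, cov_pn], [cov_pn, var_n]])
    cov_indep = np.diag([var_p, var_n])

    def sn(cov):
        return float(np.sqrt(data @ np.linalg.solve(cov, data)))

    return {
        "lensing": float(power / np.sqrt(var_p)),
        "counts": float(np.sqrt(counts)),
        "joint_full": sn(cov_full),
        "joint_indep": sn(cov_indep),
    }


# Invented mass function: steeply falling with an exponential cutoff.
masses = np.geomspace(0.1, 30.0, 40)
mean_counts = 1e3 * masses**-2.0 * np.exp(-masses / 5.0)
results = {m: toy_signal_to_noise(masses, mean_counts, m) for m in (0.1, 1.0, 5.0, 10.0)}
```

Comparing `joint_full` with `joint_indep` as the threshold is varied shows how the cross-covariance changes the combined S/N in this simplified setting. In every case the joint value with the full covariance cannot fall below either single-probe value.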

It should, however, be noted that the improvement in 
S/N is achievable only if the cross-covariance is a pri- 
ori known from the theoretical prediction e.g. based 
on CDM structure formation scenarios. Alternatively 
it could be obtained from a measurement of the cross- 
correlation from the survey region. 

In Fig. 8 we study which model ingredient in the
full covariance calculation mainly leads to the results in
Fig. 7. The dashed curve shows the percentage
difference in S/N when only the 1-halo terms are included in
each element of the full covariance matrix, which
corresponds to a simplified case in which there is no
clustering between halos. Compared to the solid line, or
Fig. 7, the results differ little. The dot-dashed
curve shows the result obtained when we ignore the
intrinsic ellipticity noise that contributes only to the
diagonal elements of the weak lensing power spectrum
covariance. Again only a small difference is found. On the
other hand, the thin dashed curve shows the results when
only the 2-halo term contribution to the cross-covariance

FIG. 9: Dependence of the percentage difference in S/N (see
lower panel of previous plot) on survey area and maximum
multipole $l_{\rm max}$, where information on the lensing power
spectrum over a range of $50 \le l \le l_{\rm max}$ is included. The default
is $l_{\rm max} = 3000$.



is included, which attempts to reproduce the results in
the previous work [i^. For this case, the impact of the
cross-covariance on the S/N is negligible, as concluded in
[4^. Rather, it turns out that the most important
effect comes from the lensing trispectrum contribution to
the lensing power spectrum covariance. If we switch off
the non-Gaussian contribution, the percentage difference
in S/N changes significantly. In particular, there is
a significant improvement in S/N by adding the cluster
counts with $M \gtrsim 10^{15} M_\odot$, because ignoring the lensing
trispectrum decreases the diagonal elements in the full
covariance matrix and thus enhances the relative
importance of the cross-covariance. This also implies that the
cosmic shear fields are highly non-Gaussian, as carefully
investigated in [48].

Fig. 9 demonstrates how this percentage difference in
S/N depends on the sky coverage ($f_{\rm sky}$) and the
maximum multipole ($l_{\rm max}$) of the lensing power spectrum.
All the curves in Fig. 9 are very similar, showing a weak
dependence on $f_{\rm sky}$ and $l_{\rm max}$. (Note, however, that the
absolute S/N itself has a strong dependence.) Nevertheless,
there are several points to note when interpreting
the results. Comparing the dotted, solid and dot-dashed
curves clarifies that, with increasing $l_{\rm max}$, the mass
threshold corresponding to the dip in the percentage difference
in S/N increases. This is because the lensing power
spectrum at higher multipoles is more sensitive to the cosmic
density fields down to smaller length scales, and
therefore the cluster counts including less massive halos are
more correlated with the lensing power spectra. Also the
impact of the cross-covariance is reduced when assuming
$l_{\rm max} = 10^4$, compared to the fiducial case of $l_{\rm max} = 3000$,
because the intrinsic ellipticity noise (shot noise) is
dominant in the lensing power spectrum covariance at such
small scales.

FIG. 10: As in Fig. 7, but the total signal-to-noise for the
lensing-based cluster counts (see Fig. 2) combined with the
lensing power spectrum measurement is shown against
detection thresholds of the cluster lensing signal (see Eq. (4)). Note
that the plotting range of the y-axis in the upper panel is smaller
than that of Fig. 7.

Similar to Fig. 7, Fig. 10 shows the total S/N ratios
for a combined measurement of the lensing-based cluster
counts and the lensing power spectrum, as a function
of the cluster lensing-signal thresholds (see Eq. (4) and
Fig. 2). Notice that the plotting range of the y-axis is roughly
half of that in Fig. 7. Because the number densities for
the lensing signal thresholds of interest are less than that
for the mass-selected cluster sample (as shown in Fig. 2),
the lensing-based cluster counts do not contribute much
to the total S/N ratios, compared to Fig. 7. Other than
this difference, the behavior of the S/N curves found in
Fig. 7 is similar to that in Fig. 10.



C. Fisher analysis for cosmological parameter 
constraints 

We now estimate the accuracies of cosmological parameter
determination, given the measurement accuracies of the
observables, using the Fisher matrix formalism [s^. [ssj.
This formalism assesses how well given observables can
constrain cosmological parameters around a fiducial
cosmological model. The parameter forecasts we obtain
depend on the fiducial model and are also sensitive to the
choice of free parameters. Furthermore, the Fisher
matrix gives only a lower limit to the parameter
uncertainties, being exact if the likelihood surface around the
local minimum is Gaussian in the multi-dimensional parameter
space. Ideally a more quantitative method would be used
to explore the global structure and obtain more accurate
parameter forecasts. As described in § IV A, we include
all the key parameters that can describe variations in the
observables within ΛCDM model cosmologies.

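The Fisher information matrix available from the
lensing power spectrum tomography is given by

$$
F^{g}_{\alpha\beta} = \sum_{m,n}\frac{\partial P_{(ij)\kappa}(l)}{\partial p_\alpha}\,[\mathbf{C}^{g}]^{-1}_{mn}\,\frac{\partial P_{(i'j')\kappa}(l')}{\partial p_\beta},
\tag{25}
$$

where the indices $m$ and $n$ label the spectra $P_{(ij)\kappa}(l)$ and
$P_{(i'j')\kappa}(l')$, respectively, and the partial derivative with
respect to the $\alpha$-th cosmological parameter $p_\alpha$ is evaluated
around the fiducial model, with the other parameters $p_\beta$
($\beta \neq \alpha$) being fixed to their fiducial values. The error on a
parameter $p_\alpha$, marginalized over other parameter uncertainties,
is given by $\sigma^2(p_\alpha) = (\mathbf{F}^{-1})_{\alpha\alpha}$, where $\mathbf{F}^{-1}$ is the inverse of the
Fisher matrix.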
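
A minimal Python sketch of this step is given below, assuming a parameter-independent covariance and derivatives of the data vector supplied as an array. The function names and the small numerical example are illustrative assumptions only.

```python
import numpy as np


def fisher_matrix(derivs, cov):
    """F_ab = sum_ij dD_i/dp_a [C^-1]_ij dD_j/dp_b.

    derivs has shape (n_params, n_data); cov has shape (n_data, n_data).
    """
    derivs = np.atleast_2d(np.asarray(derivs, dtype=float))
    return derivs @ np.linalg.solve(np.asarray(cov, dtype=float), derivs.T)


def marginalized_errors(fisher):
    """sigma(p_a) = sqrt[(F^-1)_aa], marginalized over the other parameters."""
    return np.sqrt(np.diag(np.linalg.inv(fisher)))


def unmarginalized_errors(fisher):
    """sigma(p_a) = 1 / sqrt(F_aa), with all other parameters held fixed."""
    return 1.0 / np.sqrt(np.diag(fisher))


# Made-up example: two parameters, two data points, unit covariance.
derivs = np.array([[1.0, 1.0],   # dD/dp_0
                   [0.0, 1.0]])  # dD/dp_1
fisher = fisher_matrix(derivs, np.eye(2))
sigma_marg = marginalized_errors(fisher)
sigma_fixed = unmarginalized_errors(fisher)
```

The marginalized errors are never smaller than the errors obtained with the other parameters held fixed, and the two coincide only when the parameters are not degenerate.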

Similarly, the Fisher matrix for the cluster counts is
given by

$$
F^{c}_{\alpha\beta} = \sum_{b,b'}\frac{\partial N_{(b)}}{\partial p_\alpha}\,[\mathbf{C}^{c}]^{-1}_{bb'}\,\frac{\partial N_{(b')}}{\partial p_\beta}.
\tag{26}
$$

For a combined measurement of the lensing power
spectrum and the cluster counts, the Fisher matrix is
calculated using the full covariance matrix defined by
Eq. (23) (also see Eq. (24)) as

$$
F^{g+c}_{\alpha\beta} = \sum_{i,j}\frac{\partial D_i}{\partial p_\alpha}\,[\mathbf{C}^{g+c}]^{-1}_{ij}\,\frac{\partial D_j}{\partial p_\beta},
\tag{27}
$$

where the summation indices $i, j$ run over the redshift
and multipole bins of the tomographic lensing power
spectra as well as the redshift bins of the cluster counts.

Probes of structure formation alone are not
powerful enough to constrain all the cosmological parameters
simultaneously and well. Rather, combining the
large-scale structure probes with constraints from CMB
temperature and polarization anisotropies significantly helps
to lift parameter degeneracies, in particular for the dark
energy parameters (e.g. [H,]!^). When computing the
Fisher matrix for a given CMB data set, we employ 9
parameters: the 8 parameters described in § IV A plus the
Thomson scattering optical depth to the last scattering
surface, $\tau(= 0.10)$. Note that we ignore the contribution
to the CMB spectra from primordial gravitational
waves for simplicity. We use the publicly available
CMBFAST code [83] to compute the angular power spectra
of temperature anisotropy, $C_l^{TT}$, $E$-mode polarization,
$C_l^{EE}$, and their cross-correlation, $C_l^{TE}$. To model the
measurement accuracies we assume the noise per pixel
and the angular resolution of the Planck experiment that
were assumed in [86].

To be conservative, however, we do not include the
CMB information on the dark energy equation of state
parameters, $w_0$ and $w_a$. We do this because essentially the
angular positions of the CMB acoustic peaks constrain a
degenerate combination of the curvature of the universe
and the dark energy parameters, through their
dependence on the angular diameter distance to the last
scattering surface. We are assuming a flat universe and
therefore wish to remove the artificially good constraint on
dark energy that we would get from the CMB. Note,
however, that our parameter forecasts shown below would not
change significantly even for a non-flat universe, because
we focus on large-scale structure probes in combination
with the CMB constraints, as carefully shown in [88].

We remove the CMB information on the dark energy
parameters using the following steps. We first compute
the inverse of the CMB Fisher matrix, $\mathbf{F}^{-1}_{\rm CMB}$, for the
9 parameters in order to obtain marginalized errors on
the parameters, and then re-invert a sub-matrix of the
inverse Fisher matrix that includes only the rows and
columns for the parameters other than $w_0$ and $w_a$. The
sub-matrix of the CMB Fisher matrix derived in this way
describes the accuracies of the 7 parameter determination,
having marginalized over degeneracies of the dark
energy parameters $w_0$ and $w_a$ with the other parameters for
the hypothetical Planck data sets. In addition, we use
only the CMB information in the range of multipoles
$10 \le l \le 2000$, and therefore we do not include the
integrated Sachs-Wolfe (ISW) effect that contributes to the
CMB spectra mainly at low multipoles $l \lesssim 10$, because
the ISW effect is very likely correlated with the cosmic
shear power spectrum and cluster counts, and we will
ignore these correlations in this paper.

To obtain the Fisher matrix for a joint experiment 
combining the lensing and/or cluster experiments with 
the CMB information, we simply sum the two Fisher matrices,
e.g. $F^{g+c+{\rm cmb}} = F^{g+c} + F^{\rm cmb}$, because the
CMB information can be safely considered an independent
probe from the low-$z$ universe probes in our setting.
Note that the final Fisher matrix, such as $F^{g+c+{\rm cmb}}$, has
9x9 dimensions.
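
The procedure described above (marginalize the CMB Fisher matrix over
$w_0$ and $w_a$, then add the result to the low-redshift Fisher matrix) can
be sketched numerically as follows. This is only a minimal illustration in
Python with NumPy, not the code used for the forecasts in this paper: the
positions of $w_0$ and $w_a$ in the parameter vector (`DE_IDX`) and the random
toy matrices are assumptions made for the example. The 7x7 marginalized CMB
matrix is padded with zeros in the $w_0$ and $w_a$ rows and columns so that the
sum is again a 9x9 matrix.

```python
import numpy as np

N_PAR = 9          # total number of cosmological parameters
DE_IDX = [1, 2]    # ASSUMED positions of w_0 and w_a in the parameter vector


def marginalize_out(fisher, drop):
    """Fisher matrix of the kept parameters after marginalizing over `drop`."""
    keep = [i for i in range(fisher.shape[0]) if i not in drop]
    cov = np.linalg.inv(fisher)               # full covariance matrix
    sub_cov = cov[np.ix_(keep, keep)]         # rows/columns of the other parameters
    return np.linalg.inv(sub_cov), keep       # re-invert the sub-matrix


def embed(sub_fisher, keep, n_par):
    """Pad a sub-matrix with zeros so it can be added to an n_par x n_par matrix."""
    full = np.zeros((n_par, n_par))
    full[np.ix_(keep, keep)] = sub_fisher
    return full


def toy_fisher(rng, n_par):
    """Random symmetric positive-definite matrix standing in for a real Fisher matrix."""
    a = rng.normal(size=(3 * n_par, n_par))
    return a.T @ a


rng = np.random.default_rng(0)
f_cmb = toy_fisher(rng, N_PAR)   # toy stand-in for the 9x9 CMB Fisher matrix
f_lss = toy_fisher(rng, N_PAR)   # toy stand-in for the lensing + cluster Fisher matrix

f_cmb_marg, keep = marginalize_out(f_cmb, DE_IDX)   # 7x7, no w_0 / w_a information
f_tot = f_lss + embed(f_cmb_marg, keep, N_PAR)      # 9x9 joint Fisher matrix

sigma_lss = np.sqrt(np.diag(np.linalg.inv(f_lss)))
sigma_tot = np.sqrt(np.diag(np.linalg.inv(f_tot)))
print("marginalized errors, low-z probes only:", sigma_lss)
print("marginalized errors, with CMB prior   :", sigma_tot)
```

Re-inverting the sub-matrix of the inverse is equivalent to taking the Schur
complement of the $(w_0, w_a)$ block, so the padded matrix carries no direct
CMB information on the two dark energy equation of state parameters.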



D. Forecasts for parameter constraints 

When forecasting cosmological parameter determina- 
tion, we should notice that the redshift information inher- 
ent in the cluster number counts and the lensing power 
spectrum can be very powerful to significantly tighten 
cosmological parameter errors, especially dark energy pa- 
rameters (e.g., see [S^). In the following, we will assume 
3 redshift bins for lensing tomography and 10 redshift 
bins for the cluster number counts over 0.05 < z < 1.0, 
for a survey with 5000 deg$^2$ area. The choice of three redshift
bins for lensing tomography is the minimum needed to
obtain non-degenerate constraints on the three dark energy
parameters, $\Omega_{\rm de}$, $w_0$ and $w_a$, as implied by Fig. 3
in [83], although four or more redshift bins lead to a
further, albeit modest, improvement in the parameter
errors. Since we ignore various systematic errors for both
the cluster counts and the lensing tomography, we adopt 



a rather conservative setting for the lensing tomographic 
binning. Note, however, that we have checked that a different
redshift binning for cluster counts in combination with
the lensing tomography does not change the main results
below significantly. An investigation into the optimization
of survey parameters (area and depth) and
redshift binning, taking into account possible effects of
the systematic errors in a more practical manner, will be
presented elsewhere.

We first consider mass-selected cluster counts.
Fig. 11 shows the expected 68% limits on each of the
parameters $\Omega_{\rm de}$, $\delta_\zeta$, $w_0$ and $w_a$ as a function of the mass
threshold in the cluster counts. In each case we have
marginalized over the remaining 8 cosmological parameters
(see § IV A and § IV C for the cosmological parameters
used). It should also be noted that the errors on these
4 parameters are enlarged only by ~10% without the
CMB priors. In the upper panel of each plot, the solid
curve shows the error on a given parameter when the
cross-covariance between the two observables is correctly
taken into account, while the dot-dashed curve shows the
error obtained from the lensing tomography alone. Comparing
the solid and dot-dashed curves demonstrates that
adding the cluster counts to the lensing tomography improves
the errors on the dark energy parameters, $\Omega_{\rm de}$, $w_0$ and $w_a$,
and more so for smaller mass cuts, because the
two observables depend on cosmological parameters in
different ways and combining the two can lift the parameter
degeneracies (also see [48]). To be more explicit,
the errors are improved by ~40% for mass threshold
$M_{\rm min} \sim 10^{14} M_\odot$, while the errors are improved only
slightly, by ~5%, for $M_{\rm min} \gtrsim 10^{15} M_\odot$.

On the other hand, there is a complex behavior in
the error on the primordial curvature perturbation, $\delta_\zeta$.
This is explained as follows. The cluster counts are very
sensitive to the normalization parameter of the linear
mass power spectrum ($\delta_\zeta$ for our case, or $\sigma_8$ often used
in the literature) through the sharp exponential cut-off
of the halo mass function at the high mass end. As we reduce
the minimum mass threshold down to the range
$3 \times 10^{14} \lesssim M_{\rm min}/M_\odot \lesssim 10^{15}$, we begin to lose
information about the number of very high mass clusters,
which are diluted in the total count by the large
number of low mass clusters. At much lower mass cuts,
$M_{\rm min} \lesssim 3 \times 10^{14} M_\odot$, the cluster counts again
yield a tighter constraint on $\delta_\zeta$ than the lensing tomography,
through the better measurement accuracy due to the
larger number of very low mass halos. It would also be
worth pointing out that knowing the number of clusters 
can effectively allow the lensing power spectrum to yield 
more information on the linear theory part of the power 
spectrum (e.g. one could subtract off the contribution 
from the clusters to get at the two halo term) . Therefore 
the constraint on the power spectrum amplitude can be 
partly improved from the joint constraint from the mass 
function and the linear power spectrum. Note that in this
paper we consider a simple mass threshold for the cluster
counts; however, if clusters can be binned by mass then
the information would be restored and we may also use
the shape of the mass function to constrain cosmological
parameters (e.g. [90]).

FIG. 11: The projected 68% C.L. error on each of the parameters, $\Omega_{\rm de}$ (upper-left panel), $\delta_\zeta$ (upper-right), $w_0$ (lower-left)
and $w_a$ (lower-right), marginalized over the other parameters (9 parameters in total), as a function of the mass threshold used in the
cluster counts. We assume 10 redshift bins for the cluster counts over redshifts 0.05 < z < 1.0, and 3 redshift bins for the lensing
power spectrum tomography, assuming the redshift distribution of galaxies described in § IV A, for a survey of 5000 deg$^2$ area.
In the upper panel of each plot, the solid curve shows the errors expected from a combined measurement of the cluster counts
and the lensing tomography when the cross-covariance between the two observables is correctly taken into account, while
the dot-dashed curve shows the errors for the lensing tomography alone. The dashed curve shows the error from combining
cluster counts and lensing tomography when the cross-covariance is ignored. Note that all the results shown here assume the
Planck priors discussed in § IV C. The lower panel of each plot shows the percentage difference in the parameter errors with
and without the cross-covariance, highlighting the impact of the cross-covariance on the parameter forecasts. The errors on the
dark energy equation of state parameters, $w_0$ and $w_a$, are improved by the inclusion of the cross-covariance over the mass thresholds
we have considered, but the effect is small (only a few %).

The impact of the cross-covariance between the two ob- 
servables on the parameter forecasts can be found from 
comparison of the solid and dashed curves: the dashed 



curve shows the error obtained when the two observables 
are considered to be independent. Further, the lower 
panel of each plot explicitly presents the percentage dif- 
ference in the errors. The impact of the cross-covariance 
on the parameter errors is generally very small, at only 
a few per cent.

FIG. 12: As in Fig. 11, but this plot explicitly shows the projected 68% error ellipses in two-parameter subspaces of the dark energy
parameters ($\Omega_{\rm de}$, $w_0$, $w_a$), for cluster counts with mass cut $M_{\rm min} = 5 \times 10^{14} M_\odot$. The outermost and middle bold-solid curves
in each panel are the error ellipses expected when using either the cluster counts or the lensing power spectrum
tomography alone, in combination with Planck, while the innermost shaded ellipses show the errors for the two observables combined.
For comparison, the thin-solid curves show the error ellipses obtained when ignoring the cross-covariance; the effect is very
small (the area is enlarged only by a few %). The cross symbol in each plot shows the fiducial model for the Fisher matrix
analysis.



Nevertheless, interestingly, in some cases the inclusion
of the cross-covariance leads to an improvement in the
parameter errors; for example, the errors on $w_0$ or $w_a$ are
improved over the range of mass thresholds we have
considered. This perhaps counter-intuitive result is in
part due to working in a 9-dimensional parameter space,
and could also be explained as follows (also see [48] for
related discussion). As we have carefully investigated, the
cross-covariance predicted from a CDM model quantifies
how the cluster counts and the lensing power spectrum
amplitude are correlated with each other in redshift and
multipole space. The positive cross-correlation shown in
Fig. 3 implies that for a given survey region, if the number
of clusters probed happens to be higher or lower than the
ensemble average, the lensing power spectrum is expected
to have larger or smaller amplitudes, respectively.
Therefore, such a correlated offset in the two observables
makes it more difficult to determine their true amplitudes
than in the case in which the two observables are independent,
thereby degrading the errors on parameters that
are primarily sensitive to the amplitudes of the two observables.
This explains the degradation in the errors on
$\Omega_{\rm de}$ and $\delta_\zeta$ for some range of mass thresholds. On the
other hand, the correlated offset preserves the relative
values between the cluster counts and the lensing spectrum
amplitudes in redshift and multipole space. That
is, a priori knowledge of the cross-covariance leads to an
improvement in the errors on parameters that imprint
characteristic redshift and multipole dependences onto
the cluster counts and the lensing power spectrum. This
is the case for the parameters $w_0$ and $w_a$.
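
The argument above can be illustrated with a deliberately simplified toy
model, sketched below in Python with NumPy. It is not the calculation
performed in this paper: we assume just two observables with unit variance
and a correlation coefficient `r`, an "amplitude-like" parameter that shifts
both observables in the same direction, and a "relative" parameter that
shifts them in opposite directions. All names and numbers are hypothetical.
For a positive correlation, the single-parameter Fisher error on the
amplitude-like parameter grows, while the error on the relative parameter
shrinks.

```python
import numpy as np


def single_param_error(deriv, cov):
    """1-sigma error on one parameter from F = d^T C^{-1} d (no marginalization)."""
    fisher = deriv @ np.linalg.solve(cov, deriv)
    return 1.0 / np.sqrt(fisher)


def toy_errors(r):
    """Errors on the amplitude-like and relative toy parameters for correlation r."""
    cov = np.array([[1.0, r], [r, 1.0]])     # covariance of the two toy observables
    d_amp = np.array([1.0, 1.0])             # both observables move together
    d_rel = np.array([1.0, -1.0])            # observables move in opposite directions
    return single_param_error(d_amp, cov), single_param_error(d_rel, cov)


for r in (0.0, 0.3):
    sigma_amp, sigma_rel = toy_errors(r)
    print(f"r = {r:.1f}: sigma(amplitude) = {sigma_amp:.3f}, "
          f"sigma(relative) = {sigma_rel:.3f}")
```

In this toy case the errors are $\sqrt{(1+r)/2}$ and $\sqrt{(1-r)/2}$,
respectively, so the same positive cross-covariance degrades one parameter and
improves the other.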

In Fig. 12 we show the projected 68% C.L. error ellipses
in the dark energy parameter space, for one particular
example mass threshold, $M_{\rm min} = 5 \times 10^{14} M_\odot$. The
error ellipses in a two-parameter subspace highlight how 
the two parameters considered are degenerate for a given 
observable and the degeneracies are broken by combin- 
ing different observables. It can be seen that the lensing 
power spectrum tomography and the cluster counts have 
similar degeneracy directions in constraining the dark en- 
ergy parameters. Adding the cluster counts only slightly 
improves the parameter errors compared to the errors 
from the lensing tomography alone. The plot also shows 
that the cross-covariance has a negligible effect on the 
error ellipses. 

Fig. 13 shows the results for the lensing-based cluster
counts, as a function of the cluster lensing signal,
where clusters having a lensing signal greater than a given
threshold, $(S/N)_{\rm cluster}$, are included in the counts. As in
Fig. 11, we consider 10 redshift bins for the cluster counts
over redshifts 0.05 < z < 1 and 3 redshift bins for the
lensing tomography. Note that the plotting range of the
y-axis in the upper panel of each plot is the same as that in
Fig. 11, while the plotting range in the lower panel is
different.

FIG. 13: Similar to Fig. 11, but for the lensing-based cluster counts as a function of the detection threshold, where
clusters having a lensing signal greater than the threshold are included in the counts. Note that the plotting range of the y-axis
in the upper panel of each plot is the same as that in Fig. 11. A similar improvement in the parameter errors is obtained by adding
the cluster counts, even though the lensing-based cluster counts generally contain fewer clusters than the mass-selected cluster
counts, as shown in Fig. 2. The effect of including the cross-covariance between cluster counts and lensing tomography is slightly larger
here than when a mass threshold is used for the cluster counts.

First of all, it should be noted that adding the lensing-based
cluster counts to the lensing tomography does
tighten the errors on $\Omega_{\rm de}$, $w_0$ and $w_a$ significantly, even
though the cluster counts include fewer clusters than
the mass-selected counts (see Fig. 2). For the threshold
$(S/N)_{\rm cluster} \ge 10$, which includes clusters with masses
$M \gtrsim 10^{15} M_\odot$ and mainly covers a narrow redshift range
of $0.05 \lesssim z \lesssim 0.6$, the cluster counts still improve the
dark energy parameters by ~25%, in contrast with only
~4% improvement for the mass-selected cluster counts
with $M > 10^{15} M_\odot$ in Fig. 11. We find the same percentage
improvement when the CMB priors are not included.
With the reasonable value of $(S/N)_{\rm cluster} \ge 6$, the
uncertainties are halved by adding cluster counts to lensing
power spectra. The relatively amplified sensitivity to
dark energy is attributed to the fact that the cluster lensing
signal itself depends on the dark energy parameters



via the lensing efficiency, even for a fixed halo mass (see 

Eq. m). 

For the primordial curvature perturbation $\delta_\zeta$, adding
the cluster counts does not improve the error much,
compared to the result in Fig. 11. The parameter does
not affect the amount of lensing for a given cluster so the 
poorer accuracy just arises from larger statistical errors 
due to the smaller number of clusters, compared to the 
mass-selected counts. 

As shown in the lower panel of each plot, the cross-covariance
has more influence on the parameter forecasts
than in Fig. 11. This is because lensing-based cluster
counts and lensing tomography both pick up halos
over a very similar range in redshift. Therefore there
are more significant cross-correlations between the two
observables. However, the effect of the cross-covariances
on the parameter errors is small, less than ~10%, for
$(S/N)_{\rm cluster} \ge 6$.

FIG. 14: As in Fig. 12, but for the lensing-based cluster counts with detection threshold $(S/N)_{\rm cluster} = 6$.

Fig. 14 shows the marginalized error ellipses for the
lensing-signal detection threshold $(S/N)_{\rm cluster} = 6$. The
degeneracy directions in the dark energy parameter constraints
are very similar between the cluster counts and
the lensing tomography. Compared to Fig. 12, the
lensing-based cluster counts constrain the dark energy
parameters more accurately, so that the error ellipses
shrink more when the cluster counts
and the lensing tomography are combined. This can be explained
by the fact that the contours in the dark energy equation
of state parameter space are just a projection of the full
9-dimensional space. We have investigated eigendirections in
cosmological parameter space and identified differences between
cluster counts and lensing for $\delta_\zeta$ and $\Omega_{\rm m}h^2$ (effectively
$h$, given that $\Omega_{\rm m}$ is a parameter in the Fisher matrix).
For example, we find that if we plot $w_0$ against $\delta_\zeta$ then
the cluster counts plus CMB contours are
more aligned with the $w_0$ axis, whereas the lensing plus
CMB contours are more aligned with the $\delta_\zeta$ axis. When
the two are combined, both degeneracies are broken and the
error bar on $w_0$ is reduced. The same holds for $\delta_\zeta$ and $w_a$, and for
$\Omega_{\rm m}h^2$ with each dark energy parameter. This explains
why the combined contours (cluster counts plus lensing
plus CMB) are smaller than either separate contour (cluster
counts plus CMB or lensing plus CMB), even though
the separate constraints have the same degeneracy directions
when projected down onto the $w_0$ versus $w_a$ space.

Table I summarizes the results shown in Figs. 12 and 14,
showing the marginalized uncertainties for determination
of the 4 parameters $\Omega_{\rm de}$, $\delta_\zeta$, $w_{\rm piv}$ and $w_a$, where
$M_{\rm min} = 5 \times 10^{14} M_\odot$ and $(S/N)_{\rm cluster} = 6$ are employed for the
mass-selected cluster counts and the lensing-based cluster
counts, respectively. The error on $w_{\rm piv}$, $\sigma(w_{\rm piv})$, is
the error on the dark energy equation of state at the best
constrained redshift for the given observables, or equivalently
the error on the constant equation of state parameter $w_0$
obtained by fixing $w_a$. Note that the pivot redshift $z_{\rm pivot}$
is similar for all the cases: $z_{\rm pivot} \sim 0.5$. The numerical
value $\sigma(w_{\rm piv}) \times \sigma(w_a)$ is proportional to the area of the error
ellipses in the right panels of Figs. 12 and 14. For the
mass-selected clusters there is a mild improvement in
the errors on both $w_{\rm piv}$ and $w_a$ when adding cluster counts
to lensing tomography. For the lensing-selected clusters,
the improvement is mostly in $w_a$.
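
The pivot quantities used above follow from the 2x2 marginalized covariance
of $(w_0, w_a)$ alone, assuming the parametrization $w(a) = w_0 + w_a(1-a)$.
The sketch below, in Python with NumPy, shows the standard relations. The
input covariance is a hypothetical one, constructed to be consistent with the
WLT row of Table I under the additional assumption that $z_{\rm pivot}$ is
exactly 0.5; the implied value of $\sigma(w_0)$ is therefore an artifact of
this construction and is not a number quoted in this paper.

```python
import numpy as np


def pivot_summary(cov):
    """Pivot redshift, sigma(w_piv) and sigma(w_a) from the (w_0, w_a) covariance."""
    c00, c01, c11 = cov[0, 0], cov[0, 1], cov[1, 1]
    one_minus_a = -c01 / c11                  # minimizes Var[w_0 + w_a (1 - a)]
    a_piv = 1.0 - one_minus_a
    z_piv = 1.0 / a_piv - 1.0
    sigma_piv = np.sqrt(c00 - c01**2 / c11)   # equals the error on w_0 at fixed w_a
    sigma_wa = np.sqrt(c11)
    return z_piv, sigma_piv, sigma_wa


# Hypothetical inputs: WLT row of Table I plus the ASSUMPTION z_pivot = 0.5.
sigma_piv_in, sigma_wa_in, z_piv_in = 0.039, 0.47, 0.5
x = 1.0 - 1.0 / (1.0 + z_piv_in)              # 1 - a_pivot
c11 = sigma_wa_in**2
c01 = -x * c11
c00 = sigma_piv_in**2 + x**2 * c11
cov = np.array([[c00, c01], [c01, c11]])

z_piv, sigma_piv, sigma_wa = pivot_summary(cov)
print("z_pivot               =", z_piv)
print("sigma(w_piv)          =", sigma_piv)
print("sigma(w_piv)*sigma(w_a) =", sigma_piv * sigma_wa)
print("sqrt(det C)           =", np.sqrt(np.linalg.det(cov)))
```

The last two printed numbers agree because
$\sigma(w_{\rm piv}) \times \sigma(w_a) = \sqrt{\det C}$, which is why this product
is proportional to the area of the error ellipse in the $(w_0, w_a)$ plane.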



E. Discussion of systematic errors 

We have considered idealized cases: we have ignored 
possible systematic errors involved in both the cluster 
counts and the lensing power spectrum measurement for 
simplicity. In this subsection, we present some discussion 
of possible effects of the ignored systematic errors on our 
results. 

An imperfect knowledge of galaxy redshifts inferred 
from multi-band imaging data (photometric redshifts,
hereafter simply photo-z) could affect both measurements
of cosmic shear and cluster counts. For cosmic
shear, statistical errors in photo-zs are unlikely to be a 
dominant source of the error budget of cosmic shear mea- 
surement, if they are well characterized [Sll, [s^. This is 
because gravitational lensing has a broad redshift sensi- 
tivity function. As carefully investigated in [HI, [s^, the 
dominant source of the systematic error is instead a
mean bias in the photo-zs, which causes the mean redshifts of the
tomographic bins to be shifted relative to the true mean
redshifts. For planned future surveys, the mean redshifts
need to be known to better than a few tenths of a percent
accuracy in redshift in order to avoid much degradation
in cosmological parameter errors. To achieve this
requirement, a large representative spectroscopic redshift
sub-sample of the full set of galaxies used for lensing may
be needed to calibrate/correct photo-z errors.

Probes                                            $\sigma(\Omega_{\rm de})$   $\sigma(\ln\delta_\zeta)$   $\sigma(w_{\rm piv})$   $\sigma(w_a)$   $\sigma(w_{\rm piv}) \times \sigma(w_a)$
WLT                                               0.014                       0.013                       0.039                   0.47            0.019
CCM ($M_{\rm min} = 5 \times 10^{14} M_\odot$)    0.028                       0.015                       0.085                   0.95            0.081
WLT+CCM                                           [0.013]                     [0.0096]                    [0.033]                 [0.44]          [0.0143]
WLT+CCM (with cross-cov.)                         0.013 (0%)                  0.0093 (3%)                 0.032 (3%)              0.42 (3%)       0.0135 (6%)
CCWL ($(S/N)_{\rm cluster} > 6$)                  0.026                       0.015                       0.061                   0.89            0.054
WLT+CCWL                                          [0.0076]                    [0.012]                     [0.038]                 [0.26]          [0.00993]
WLT+CCWL (with cross-cov.)                        0.0067 (12%)                0.012 (0%)                  0.038 (0%)              0.25 (4%)       0.00958 (4%)

TABLE I: Expected marginalized errors (68% C.L.) for weak lensing tomography (WLT), the mass-selected cluster counts
with mass threshold $M_{\rm min} = 5 \times 10^{14} M_\odot$ (CCM) and the lensing-based cluster counts with detection threshold $(S/N)_{\rm cluster} = 6$
(CCWL), where all the probes are combined with Planck. Here the error $\sigma(w_{\rm piv})$ is the error on the equation of state at the
best constrained redshift for a given observable. A row labeled, e.g., 'WLT+CCM' shows the parameter errors expected
when combining, e.g., 'WLT' and 'CCM'. The numerical values in brackets show the errors obtained when ignoring the
cross-covariance between the cluster counts and the lensing tomography, while the percentages in parentheses indicate the improvement
in the errors from including the cross-covariance.

For cluster counts, photo-z errors cause uncertainties 
in redshift estimates of individual clusters and in addi- 
tion cause uncertainties in the lensing signal of individual 
clusters, if a lensing-based cluster catalog is used. For 
the lensing signal, the requirements on photo-z accuracy 
would be similar to that for cosmic shear, as discussed 
in the previous paragraph. To estimate the redshift of 
a cluster using photo-zs, we often have old red-sequence 
galaxies for which good photo-zs are easier to obtain due 
to a strong 4000 Å break (e.g., [Hj). Further, the redshift
of the cluster is an average over the redshifts of
the cluster members, thus reducing the uncertainty yet 
further. In addition, since clusters are relatively rare ob- 
jects it would not be very expensive to perform follow-up 
spectroscopy on a central bright galaxy or some member 
galaxies. These high-quality redshifts would allow much 
finer redshift binning of the cluster distribution than red- 
shift bins of the cosmic shear tomography. Then, tak- 
ing the cross-correlation between the clusters with known 
redshifts and a fair sub-sample of the galaxies used for 
the cosmic shear tomography may be used to calibrate 
the photo-z errors, because the cross-correlation is
non-vanishing only if the source galaxies are physically associated
with the clusters (see [91] for the related discussion).
This issue would be worth exploring further, and will be 
presented elsewhere. 

Intrinsic alignments of galaxy ellipticities are another 
potential source of systematic errors for cosmic shear
measurement (see [92] for details and references
therein). There are two kinds of contamination.



The first is intrinsic-intrinsic galaxy alignment (II) that 
may arise from neighboring galaxies residing in a similar 
tidal field of large-scale structure [9^. The second ef- 
fect is a cross-correlation between intrinsic ellipticity of a 
foreground galaxy and lensing distortion of background 
galaxy shape (GI) because the foreground tidal field af- 
fecting the intrinsic ellipticity of a foreground galaxy may 
also cause lensing shear of a distant, background galaxy 
([g^jV In general member galaxies of a cluster tend to be 
more elliptical and therefore the width of the ellipticity 
histogram (intrinsic ellipticity dispersion) will be smaller 
for cluster members due to the absence of e.g. edge-on 
spirals. Therefore if there are by chance more clusters 
in a surveyed area then the noise on the shear power 
spectrum will be slightly smaller. This is likely a tiny 
effect and can be safely neglected. Another interesting 
possibility is that if there are many clusters in a surveyed 
area then both the II and GI contamination to the cos- 
mic shear power spectra would be larger, since member 
galaxies of a cluster are more aligned with each other. 
Also the stronger tidal field due to the cluster may also 
cause the cluster members to be more anti-aligned with 
background galaxy shapes due to lensing distortion by 
the cluster. However, identifying clusters within a given 
survey region could be a useful way to remove/correct 
this II and GI contamination, which is another interest- 
ing possibility of the combined cluster counts and cosmic 
shear to use for future surveys. 

For cluster counts, the most problematic source of
systematic errors is the uncertainty in relating cluster
observables to halo mass. One traditional way to tackle
this obstacle is to investigate properties of known massive
clusters in great detail by combining various techniques (radio,
optical, cluster lensing, X-ray and the SZ effect). Alternatively, one
might develop a reliable model for the mass-observable
relation using hydrodynamical simulations of cluster formation
that fully take into account the associated physical
processes in the intracluster medium. The mass-observable
relation obtained in these ways could then be used
for cluster counting statistics if the derived relation is a
fair representation of the mass-observable distribution of
clusters in the sample.

For lensing-based cluster counts, projection effects on a 
cluster lensing signal due to mass along line-of-sight that 
is not associated with the cluster introduce additional 
statistical errors in the mass estimates of individual clus- 
ters [13, HH]. In addition, the scatter will be correlated 
with the cosmic shear power spectrum, which we have 
also not taken into account. To estimate the mass es- 
timate uncertainty and the effect of the ignored corre- 
lation in a quantitative way, ray-tracing simulations of 
cosmic shear including cluster lensing contributions will 
be needed. Also, in practice, traditional methods (optical,
X-ray, the SZ effect) will need to be combined to exclude
false clusters from the sample. These issues are beyond 
the scope of this paper. 

One may develop a model to describe the mass- 
observable relation in terms of nuisance parameters. 
Then we could use cluster observables available from 
a given survey to 'self-calibrate', i.e. determine both 
the cosmological parameters and the nuisance parame- 
ters concurrently. In particular, it was shown in [2^ [23| 
that adding the two-point correlation function of clusters 
to cluster counts, both of which are drawn from the same 
survey region, can be a useful way to self-calibrate the 
model systematic errors in the mass-observable relation 
because the amplitude of the cluster two-point function 
is very sensitive to halo bias that is fairly well specified 
by halo masses. 

Having the discussion above in mind, it would be in- 
teresting to address whether the self-calibration regime 
could be attained for the combined measurements of clus- 
ter counts and cosmic shear tomography, taking into ac- 
count the effects of systematic errors involved in each 
observable. The cluster counts and cosmic shear depend 
on the cosmological parameters in different ways and are 
sensitive to different systematic errors. Hence one can 
use the combined measurement to constrain simultane- 
ously the cosmological parameters as well as the nuisance 
parameters of systematic errors, mitigating degradation 
in the cosmological parameter determination due to the 
systematic errors. Also importantly one could realize, 
for a given survey, the requirements on the control of the 
systematic errors (photo-z, mass-observable relation etc) 
to attain the desired accuracy of constraining dark en- 
ergy parameters. In this direction, the cross-covariances 
between the cluster counts and cosmic shear tomogra- 
phy may play an intriguing role, because (1) the cross- 
correlations are cosmological signals arising from the cos- 
mic mass density field in large-scale structures or, in 
other words, there is in general little cross-correlation be- 
tween the systematic errors in the two observables, and 
(2) a CDM structure formation model provides accurate 
predictions for the cosmological cross-covariances. Hence 
including the cross-covariances in the parameter estima- 
tions may be used as another viable monitor of the sys- 
tematic errors. This interesting issue is beyond the scope 



of this paper and will be presented elsewhere. 



V. CONCLUSION AND DISCUSSION 

In this paper we have estimated accuracies on cosmo- 
logical parameters derivable from a joint experiment of 
cluster counts and cosmic shear power spectrum tomog- 
raphy when the two are drawn from the same survey 
region. In doing this we have properly taken into ac- 
count the cross-covariance between the two observables, 
which describes how the two observables are correlated 
in redshift and multipole space. This is necessary be- 
cause the two experiments probe the same cosmic density 
fields. However note that, since we have ignored possi- 
ble systematic errors, all the results shown in this paper 
demonstrate pure cosmological powers for the combined 
method. We will below summarize our findings, and then 
will discuss the remaining issues. 

We have developed a formulation to compute the cross- 
covariance between the cluster counts and the cosmic 
shear power spectra based on the dark matter halo ap- 
proach within the framework of a CDM structure forma- 
tion model (see Appendix). The cross-covariance arises 
from the three-point correlation function between the 
cluster distribution and two points of the mass density 
fields. It is found that there is a significant positive
cross-correlation between the cluster counts probing clusters
with masses $M \gtrsim 10^{14} M_\odot$ and the lensing power spectrum
amplitudes at multipoles $l \sim 10^3$. Here the term
'positive' is used to mean that if fewer or more massive
clusters are found in a given survey region than the
ensemble average, the lensing power spectra will most
likely have smaller or larger amplitudes, respectively.
The cross-correlation on the angular and mass scales of
interest arises mainly from the 1-halo term contribution of
the three-point correlations: the correlation between one
point within a given cluster and the shearing effects on
two different background galaxies due to the same cluster.
Our results are more accurate than the earlier work
presented in [49], because that work ignored the 1-halo
term contribution to the cross-covariance and only
included the 2-halo term contribution, which is dominant
only on large angular scales where useful cosmological
information cannot be extracted.

To quantify the impact of the cross-covariance, we first 
investigated the total signal-to-noise (S/N) ratios for a 
joint measurement of the cluster counts and the lensing 
power spectrum. It was shown that an inclusion of the 
cross-covariance leads to degradation and, depending on 
the mass thresholds or the lensing detection thresholds, 
improvement in the S/N ratios of up to ~ ±20% compared
to the case where the two observables are considered to
be independent (see Figs. [71 and [TII)) . The improvement 
occurs when the cluster counts including massive halos
$M \gtrsim 10^{15} M_\odot$ are combined with the lensing power spectrum
measurement (also see [H, [s^ for the related discussion).
This occurs even though the S/N ratio for the
cluster counts alone is much less than that for the lensing
power spectrum alone. That is, a knowledge of the
number of such massive clusters for a given survey region 
helps improve accuracies of the joint measurement. This 
improvement is achievable only if the cross-covariance is 
a priori known by using the theoretical predictions or by 
directly estimating the cross-correlation from the survey 
region. We also note that the results change greatly if we 
ignore the non-Gaussian error contribution to the lensing 
power spectrum covariance, which arises from the lensing
trispectrum (see Fig. 5). This implies that the lensing
fields are highly non-Gaussian (see [48] for an extensive
discussion).
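
The two regimes described above (degradation or improvement of the total
S/N) can be reproduced qualitatively with a two-number toy model, sketched
below in Python with NumPy. This is an assumption-laden illustration rather
than the binned calculation of this paper: each observable is reduced to a
single signal expressed in units of its own 1-sigma error (`s_lens`,
`s_clus`), and the cross-covariance to a single correlation coefficient `r`;
all values are hypothetical. The total S/N is computed as
$(S/N)^2 = D^T C^{-1} D$.

```python
import numpy as np


def total_sn(s_lens, s_clus, r):
    """Total S/N of two unit-variance observables with correlation coefficient r."""
    signal = np.array([s_lens, s_clus])
    cov = np.array([[1.0, r], [r, 1.0]])
    return np.sqrt(signal @ np.linalg.solve(cov, signal))


r = 0.3  # hypothetical positive cross-correlation coefficient
for s_lens, s_clus in [(10.0, 10.0), (10.0, 1.0)]:
    independent = total_sn(s_lens, s_clus, 0.0)
    correlated = total_sn(s_lens, s_clus, r)
    change = 100.0 * (correlated / independent - 1.0)
    print(f"S/N lens = {s_lens}, S/N clusters = {s_clus}: "
          f"independent = {independent:.2f}, with cross-cov. = {correlated:.2f} "
          f"({change:+.1f}%)")
```

When the two signals are comparable, a positive correlation lowers the total
S/N; when one observable has a much smaller S/N than the other, the same
correlation raises it, in line with the behavior described for the counts of
massive clusters.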

We then presented forecasts for accuracies of the cos- 
mological parameter determination for the joint experi- 
ment. To do this we included redshift binning for both 
the cluster counts and the lensing power spectrum, mo- 
tivated by the fact that the additional redshift informa- 
tion is very useful to tighten the cosmological parameter 
constraints, especially the dark energy parameters. In 
this paper we considered two simplified cluster selection 
criteria: one is a mass-selected cluster sample, and the 
other is the lensing-based cluster sample, where the latter 
contains clusters having the lensing signal greater than 
a given threshold in the sample. For the mass-selected 
cluster counts, it was found that combining the cluster 
counts and the lensing tomography leads to significant 
improvement in the errors on the dark energy parameters
by ~40% only if the cluster counts include halos down
to lower masses, such as $M \gtrsim 10^{14} M_\odot$
(see Fig. 11). The improvement is due to the different
dependence of the two observables on the cosmological 
parameters. 

On the other hand, the lensing-based cluster
counts are more complementary to the lensing power
spectrum tomography in tightening
the errors on the dark energy parameters than the
mass-selected cluster counts (see Fig. 13). For example,
adding the counts of clusters with high lensing signals,
$(S/N)_{\rm cluster} \ge 6$, improves the dark energy errors by
a factor of 2, even though the counts contain many fewer
clusters and probe a narrower redshift range than the
mass-selected clusters with $M \gtrsim 5 \times 10^{14} M_\odot$ (see Fig. 2).
This result is encouraging because such massive halos are
rare and therefore it seems relatively easy to make follow-up
observations, e.g., in order to obtain well-calibrated
relations between cluster mass and observables (also see
§ IV E for the discussion). The reason lensing-based cluster
counts are more powerful is that
the cluster lensing signal itself depends on the cosmological
parameters via the lensing efficiency, and this dependence
amplifies the sensitivity of the cluster counts
to the dark energy parameters. However, with low detection
thresholds such as $(S/N)_{\rm cluster} \sim 3$ the lensing-based
counts begin to suffer too much from projection
effects due to large-scale structures that are not associated
with the cluster. Hence, if traditional mass-selected
cluster counts can go to lower masses then they might
catch up with, or overtake, the lensing-based counts in
their constraining power.

The impact of the cross-covariance on the parameter
determination is generally small for both the
mass-selected and the lensing-based cluster counts. This
is partly because the lensing power spectra are sensitive
to the total number of clusters roughly weighted by the
cluster mass squared, whereas for the cluster counts we
simply add up the number of clusters (see Appendix C).
The two probes therefore do not measure the same
quantity, and the cross-covariance is smaller than if
both measured the unweighted total number of clusters.
Further, the redshift weighting differs between the
lensing power spectra and the cluster counts, so not all
the halos are in common. The small impact is also partly
a result of working in a multi-dimensional parameter
space (9 parameters in our case). Yet it is intriguing to
note that the dark energy parameter errors are in most
cases improved by including the cross-covariance (see the
lower panels of each plot in Figs. 11 and 13; also see
[13] for the related discussion). In summary, a joint
experiment of cluster counts and lensing power spectrum
tomography is worth exploring in order to exploit the
full information on the cosmological parameters from
future massive surveys, and the cross-covariance must be
included in order to estimate the error bars correctly.

In this work we have assumed that cluster counts mea- 
sure the total number of clusters above some threshold, 
in a number of redshift bins. In principle subdividing 
cluster counts in mass or lensing signal bins could also 
improve cosmological parameter constraints [2^, [o^ . 
This would make the improvement from adding cluster
counts to lensing power spectra even more impressive;
however, the cross-covariance may then be more important
than we find in this paper.

Finally, we comment on the possibility of an ultimate
experiment that combines all observables available from
one survey region. As we have shown, one can combine
different observables to improve the accuracy of the
cosmological parameter determination, even though the
observables probe the same cosmic density field. Besides
the cluster counts and the lensing power spectrum
considered in this paper, various other observables will
be available:
cosmic shear bispectra or more generally n-point corre- 
lation functions of the cosmic shear fields [39], n-point 
correlation functions of cluster and galaxy distributions 
[9^, small-scale cluster lensing signals [9q . cosmic flex- 
ion correlation functions [97l . l98j and so on. Then, one 
natural question arises: can we combine all the
observables in order to improve the parameter constraints
as much as possible? Or, in the presence of systematic
errors, is there an optimal combination of the
observables that maximizes the parameter constraints
while best mitigating the degradation caused by the
systematic errors? To address this interesting issue
quantitatively, all the covariances between the
observables used have to be correctly taken into account.
We believe that the formulation developed in this paper
will be useful for computing the covariances for any
observables and their combinations. This kind of study
will be worth exploring in order to exploit the full
potential of future expensive surveys for constraining
the nature of the mysterious dark energy components and
possible modifications of gravity.
APPENDIX

In this Appendix, we describe a detailed formulation to compute the covariances of the cluster counts, of the
lensing power spectra, and of the joint experiment. These are needed to quantify the measurement errors
and the error correlations between different redshift and multipole bins for given survey parameters and cosmological
models. The covariances can be predicted given a reliable model of non-linear gravitational clustering in
structure formation. To do this, we use the halo model approach developed in [5l,[5§| (also see [stIIssI. lool . llOOj j: |60j 
for a thorough review of the halo model). 



APPENDIX A: HALO MODEL APPROACH 



1. Modeling of mass and cluster distributions

In the halo model approach, we assume that all the matter is in halos with density profile $\rho_h(\mathbf{x}; m)$ that is
parametrized by a mass $m$ (e.g., the virial mass). In this setting, the mass density field at an arbitrary spatial
position $\mathbf{x}$, $\rho(\mathbf{x})$, is written as

$$
\rho(\mathbf{x}) = \sum_i \rho_h(\mathbf{x} - \mathbf{x}_i; m_i) = \sum_i m_i\, u_{m_i}(\mathbf{x} - \mathbf{x}_i)
= \sum_i \int\! dm \int\! d^3x'\; \delta_D(m - m_i)\, \delta_D^3(\mathbf{x}' - \mathbf{x}_i)\, m\, u_m(\mathbf{x} - \mathbf{x}'), \tag{A1}
$$

where we have introduced the normalized halo density profile $u_m(\mathbf{x})$ through $\rho_h(\mathbf{x}; m) = m\, u_m(\mathbf{x})$, the summation
runs over halos (the index $i$ denotes the $i$-th halo) and $\delta_D(x)$ is the Dirac delta function. The first equality just says
that the mass density field at the position $\mathbf{x}$ is expressed as a superposition of the density profiles of all halos existing in
the universe, where the vector between the position $\mathbf{x}$ and the halo position is given by $\mathbf{x} - \mathbf{x}_i$, with $\mathbf{x}_i$ being the $i$-th
halo's center.

The ensemble average of the mass density field over realisations of the universe is shown to be

$$
\langle \rho(\mathbf{x}) \rangle = \int\! dm\; m\, n(m) \int\! d^3x'\; u_m(\mathbf{x} - \mathbf{x}') = \int\! dm\; m\, n(m) = \bar{\rho}_m, \tag{A2}
$$

where we have assumed the ensemble average $\langle \sum_i \delta_D(m - m_i)\, \delta_D^3(\mathbf{x} - \mathbf{x}_i) \rangle = n(m)$, so that the ensemble average does
not depend on any specific spatial position. Eq. (A2) thus demonstrates that the ensemble average of the mass density
field is equal to the cosmic average mass density $\bar{\rho}_m$, as expected. Note that the mass function $n(m)$ given in Press &
Schechter [63] or the improved one in [61] by definition satisfies the normalization condition $\int\! dm\; n(m)\,(m/\bar{\rho}_m) = 1$. To
properly define the halo mass $m$ for an extended halo profile, the normalization condition $\int\! d^3x\; u_m(\mathbf{x}) = 1$ must
be satisfied. In this paper we assume the density is zero outside the virial radius (see [56] for a detailed discussion
about which mass definitions are self-consistent in the halo model approach).
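
As a quick numerical illustration of the normalization condition $\int\! dm\; n(m)(m/\bar{\rho}_m) = 1$, the following Python sketch integrates the Press-Schechter multiplicity function. It assumes the standard change of variable to the peak height $\nu = \delta_c/\sigma(m)$, under which the mass fraction per unit $\nu$ is $f(\nu) = \sqrt{2/\pi}\, e^{-\nu^2/2}$; the function names, the upper integration limit and the step count are illustrative choices rather than anything used in this paper.

```python
import math

def ps_multiplicity(nu):
    """Press-Schechter mass fraction per unit peak height nu = delta_c / sigma(m)."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * nu * nu)

def mass_fraction_in_halos(nu_max=12.0, n_steps=20000):
    """Midpoint-rule integral of f(nu) over 0 < nu < nu_max (assumed cut-off)."""
    dnu = nu_max / n_steps
    return sum(ps_multiplicity((i + 0.5) * dnu) for i in range(n_steps)) * dnu

total = mass_fraction_in_halos()
print(f"fraction of mass in halos = {total:.6f}")
```

The integral should come out very close to unity, which is the statement that all the matter is assigned to halos of some mass.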

In this paper we consider constraints on cosmology calculated from number counts of clusters in a hypothetical
cluster experiment (see § III B). In number counts the cluster distribution is treated as a set of points; in other words,
one does not care about the shape of the mass distribution within a cluster. The relevant quantity is the number
density field of clusters, which can be straightforwardly modeled in the halo model formulation by a slight
modification of Eq. (A1):

$$
n_{\rm cl}(\mathbf{x}) = \sum_i \delta_D^3(\mathbf{x} - \mathbf{x}_i)\, S(m_i)
= \sum_i \int\! dm\; S(m)\, \delta_D(m - m_i) \int\! d^3x'\; \delta_D^3(\mathbf{x}' - \mathbf{x}_i)\, \delta_D^3(\mathbf{x} - \mathbf{x}'), \tag{A3}
$$

where the subscript 'cl' in $n_{\rm cl}$ stands for cluster and $S(m)$ denotes the selection function that discriminates the clusters
used in the number counts from other halos. The ensemble average of the cluster number density field (A3) is found,
similarly to Eq. (A2), to be given by

$$
\bar{n}_{\rm cl} \equiv \langle n_{\rm cl}(\mathbf{x}) \rangle = \int\! dm\; n(m)\, S(m) \int\! d^3x'\; \delta_D^3(\mathbf{x} - \mathbf{x}') = \int\! dm\; n(m)\, S(m). \tag{A4}
$$

Thus, as expected, the ensemble average of the number density field is given by the mass integral of the halo mass
function with a given selection criterion.
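
The mass integral in Eq. (A4) can be sketched numerically as follows. This is only an illustration: the mass function used here, $n(m) \propto m^{-2} e^{-m/m_*}$, its amplitude, the cut-off mass and the sharp mass threshold used for $S(m)$ are all hypothetical choices, not the mass function or selection adopted in this paper.

```python
import math

M_STAR = 1.0e14     # hypothetical cut-off mass [M_sun], illustration only
AMPLITUDE = 1.0e9   # hypothetical normalisation, arbitrary units

def toy_mass_function(m):
    """Hypothetical mass function n(m) = A m^-2 exp(-m / m_star)."""
    return AMPLITUDE * m ** -2 * math.exp(-m / M_STAR)

def selection(m, m_min):
    """Sharp mass threshold: S(m) = 1 for selected clusters, 0 otherwise."""
    return 1.0 if m >= m_min else 0.0

def mean_cluster_density(m_min, m_lo=1e11, m_hi=1e17, n_steps=4000):
    """Midpoint-rule estimate of int dm n(m) S(m), integrating in ln m."""
    dlnm = math.log(m_hi / m_lo) / n_steps
    total = 0.0
    for i in range(n_steps):
        m = m_lo * math.exp((i + 0.5) * dlnm)
        total += toy_mass_function(m) * selection(m, m_min) * m * dlnm
    return total

for m_min in (1e13, 1e14, 1e15):
    print(f"M_min = {m_min:.0e}: n_cl = {mean_cluster_density(m_min):.3e}")
```

Raising the threshold removes halos from the integral, so the mean cluster density decreases monotonically with the minimum mass.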



2. Correlation functions of mass and cluster distributions 



In this subsection we use the halo model approach to derive the correlation functions of the cluster distribution 
and the cross-correlation between the cluster distribution and the mass distribution. These are needed to quantify 
covariances of the cluster counts and the cross-covariance between the cluster counts and the lensing power spectrum, 
respectively. 

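From Eq. (A3), the 2-point correlation function of the cluster number density field can be computed as

$$
\begin{aligned}
\bar{n}_{\rm cl}^2\, \xi_{cc}(|\mathbf{x}_1 - \mathbf{x}_2|) &\equiv \langle n_{\rm cl}(\mathbf{x}_1)\, n_{\rm cl}(\mathbf{x}_2) \rangle - \bar{n}_{\rm cl}^2 \\
&= \Big\langle \sum_i S^2(m_i)\, \delta_D^3(\mathbf{x}_1 - \mathbf{x}_i)\, \delta_D^3(\mathbf{x}_2 - \mathbf{x}_i) \Big\rangle
 + \Big\langle \sum_{i \ne j} S(m_i) S(m_j)\, \delta_D^3(\mathbf{x}_1 - \mathbf{x}_i)\, \delta_D^3(\mathbf{x}_2 - \mathbf{x}_j) \Big\rangle - \bar{n}_{\rm cl}^2 \\
&= \Big\langle \sum_i \int\! dm \int\! d^3y\; S(m)\, \delta_D(m - m_i)\, \delta_D^3(\mathbf{x}_1 - \mathbf{y})\, \delta_D^3(\mathbf{x}_2 - \mathbf{y})\, \delta_D^3(\mathbf{y} - \mathbf{x}_i) \Big\rangle \\
&\quad + \Big\langle \sum_{i \ne j} \int\! dm \int\! d^3y\; S(m)\, \delta_D(m - m_i)\, \delta_D^3(\mathbf{x}_1 - \mathbf{y})\, \delta_D^3(\mathbf{y} - \mathbf{x}_i) \\
&\qquad\quad \times \int\! dm' \int\! d^3y'\; S(m')\, \delta_D(m' - m_j)\, \delta_D^3(\mathbf{x}_2 - \mathbf{y}')\, \delta_D^3(\mathbf{y}' - \mathbf{x}_j) \Big\rangle - \bar{n}_{\rm cl}^2 \\
&= \int\! dm\; n(m) S(m) \int\! d^3y\; \delta_D^3(\mathbf{x}_1 - \mathbf{y})\, \delta_D^3(\mathbf{x}_2 - \mathbf{y}) \\
&\quad + \int\! dm\; n(m) S(m) \int\! d^3y\; \delta_D^3(\mathbf{x}_1 - \mathbf{y}) \int\! dm'\; n(m') S(m') \int\! d^3y'\; \delta_D^3(\mathbf{x}_2 - \mathbf{y}')\, \xi_h(\mathbf{y} - \mathbf{y}'; m, m') \\
&= \bar{n}_{\rm cl}\, \delta_D^3(\mathbf{x}_1 - \mathbf{x}_2) + \left[ \int\! dm\; n(m) S(m) b(m) \right]^2 \xi_\delta^L(|\mathbf{x}_2 - \mathbf{x}_1|),
\end{aligned} \tag{A5}
$$

where we have used $S^2(m_i) = S(m_i)$ since $S(m_i) = 1$ or $0$ (see text below Eq. [5]), and the 2-point correlation function
$\xi_{cc}$ is dimensionless as usual. The first term on the r.h.s. represents the 1-halo term contribution that arises from the
discrete nature of the clusters probed (see § 31 in [101]), while the second term gives the 2-halo term contribution that
arises from the clustering of clusters. Note that in the last line on the r.h.s. we have assumed that the 2-point correlation
between two different halos with masses $m$ and $m'$ is given by the linear theory mass correlation function, $\xi_\delta^L(r)$,
multiplied by the halo bias parameters $b(m)$ and $b(m')$: $\xi_h(r; m, m') = b(m)\, b(m')\, \xi_\delta^L(r)$.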

Next we consider the 3-point correlation function between the cluster number density field and two points of the
mass density fluctuation field. This 3-point function is needed to quantify the cross-covariance between the cluster
counts and the lensing power spectrum, where the lensing power spectrum arises from the 2-point correlation of the
mass fluctuation field. The 3-point correlation function we are interested in is defined as

$$
\begin{aligned}
\bar{n}_{\rm cl}\, \zeta_{c\delta\delta}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3) &\equiv \langle \delta n_{\rm cl}(\mathbf{x}_1)\, \delta_m(\mathbf{x}_2)\, \delta_m(\mathbf{x}_3) \rangle \\
&= \bar{n}_{\rm cl} \left[ \zeta^{1h}_{c\delta\delta}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3) + \zeta^{2h}_{c\delta\delta}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3) + \zeta^{3h}_{c\delta\delta}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3) \right],
\end{aligned} \tag{A6}
$$

where $\zeta_{c\delta\delta}$ is dimensionless, $\delta_m$ denotes the mass density fluctuation field, and $\delta n_{\rm cl}(\mathbf{x})$ is the fluctuation part of the
cluster number density field $n_{\rm cl}(\mathbf{x})$ in Eq. (A3) (the homogeneous part does not contribute to the 3-point correlation).
In the second equality on the r.h.s. we divided the 3-point function into three distinct contributions: the 1-, 2- and
3-halo terms of the halo model approach. Note that the 3-point correlation function is a function
of triangle configuration, and its amplitude is invariant under parallel translation and rotation of the
triangle configuration, assuming statistical homogeneity and isotropy. Therefore, the 3-point correlation function is specified by
three parameters that describe a triangle configuration, e.g. three side lengths.

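Using a calculation procedure similar to that used for Eq. (A5), the 1-halo term contribution to $\zeta_{c\delta\delta}$ can be computed as

$$
\begin{aligned}
\bar{n}_{\rm cl}\, \zeta^{1h}_{c\delta\delta}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3)
&= \int\! dm\; n(m) S(m) \int\! d^3x_1'\; \delta_D^3(\mathbf{x}_1 - \mathbf{x}_1')\, \frac{m}{\bar{\rho}_m} u_m(\mathbf{x}_2 - \mathbf{x}_1')\, \frac{m}{\bar{\rho}_m} u_m(\mathbf{x}_3 - \mathbf{x}_1') \\
&= \int\! dm\; n(m) \left( \frac{m}{\bar{\rho}_m} \right)^2 S(m)\, u_m(\mathbf{x}_2 - \mathbf{x}_1)\, u_m(\mathbf{x}_3 - \mathbf{x}_1).
\end{aligned} \tag{A7}
$$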
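The 2-halo term contribution $\zeta^{2h}_{c\delta\delta}$ is found to be

$$
\begin{aligned}
\bar{n}_{\rm cl}\, \zeta^{2h}_{c\delta\delta}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3)
&= \int\! dm_1\; n(m_1) S(m_1) b(m_1) \int\! dm_2\; n(m_2) b(m_2) \left( \frac{m_2}{\bar{\rho}_m} \right)^2 \int\! d^3y\; u_{m_2}(\mathbf{x}_2 - \mathbf{y})\, u_{m_2}(\mathbf{x}_3 - \mathbf{y})\, \xi_\delta^L(\mathbf{x}_1 - \mathbf{y}) \\
&\quad + \int\! dm_1\; n(m_1) b(m_1) S(m_1) \frac{m_1}{\bar{\rho}_m} u_{m_1}(\mathbf{x}_2 - \mathbf{x}_1) \int\! dm_2\; n(m_2) b(m_2) \frac{m_2}{\bar{\rho}_m} \int\! d^3y\; u_{m_2}(\mathbf{x}_3 - \mathbf{y})\, \xi_\delta^L(\mathbf{x}_1 - \mathbf{y}) \\
&\quad + \int\! dm_1\; n(m_1) b(m_1) S(m_1) \frac{m_1}{\bar{\rho}_m} u_{m_1}(\mathbf{x}_3 - \mathbf{x}_1) \int\! dm_2\; n(m_2) b(m_2) \frac{m_2}{\bar{\rho}_m} \int\! d^3y\; u_{m_2}(\mathbf{x}_2 - \mathbf{y})\, \xi_\delta^L(\mathbf{x}_1 - \mathbf{y}).
\end{aligned} \tag{A8}
$$

The 2-halo term arises from the correlation between two different halos, where the clustering strength between the
two halos is given by $b(m_1)\, b(m_2)\, \xi_\delta^L$, as in Eq. (A5).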

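The 3-halo term $\zeta^{3h}_{c\delta\delta}$ in Eq. (A6), which arises from the correlation between three different halos, is given by

$$
\begin{aligned}
\bar{n}_{\rm cl}\, \zeta^{3h}_{c\delta\delta}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3)
&= \int\! dm_1\; n(m_1) S(m_1) b(m_1) \int\! dm_2\; n(m_2) b(m_2) \frac{m_2}{\bar{\rho}_m} \int\! d^3y\; u_{m_2}(\mathbf{x}_2 - \mathbf{y}) \\
&\quad \times \int\! dm_3\; n(m_3) b(m_3) \frac{m_3}{\bar{\rho}_m} \int\! d^3y'\; u_{m_3}(\mathbf{x}_3 - \mathbf{y}')\, \zeta_\delta^{PT}(\mathbf{x}_1, \mathbf{y}, \mathbf{y}'),
\end{aligned} \tag{A9}
$$

where $\zeta_\delta^{PT}$ is the perturbation theory prediction for the 3-point correlation function of the mass density field, and
we have assumed that the 3-point correlation of the halo distribution can be expressed by $\zeta_\delta^{PT}$ multiplied by the halo bias
parameters: $\zeta_h(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3; m_1, m_2, m_3) = b(m_1)\, b(m_2)\, b(m_3)\, \zeta_\delta^{PT}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3)$.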
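For convenience in the following discussion, we derive the Fourier-transformed counterpart of the 3-point correlation
function $\zeta_{c\delta\delta}$, the bispectrum $B_{c\delta\delta}$. The bispectrum is related to the 3-point correlation function via the Fourier
transform given by

$$
\zeta_{c\delta\delta}(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3) = \int \prod_{i=1}^{3} \frac{d^3k_i}{(2\pi)^3}\; B_{c\delta\delta}(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3)\,
\exp\!\left[ i (\mathbf{k}_1 \cdot \mathbf{x}_1 + \mathbf{k}_2 \cdot \mathbf{x}_2 + \mathbf{k}_3 \cdot \mathbf{x}_3) \right] (2\pi)^3 \delta_D^3(\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3). \tag{A10}
$$

Note that the bispectrum is also defined by the ensemble average of the three Fourier-transformed coefficients of the
cluster number density field and the mass density fields as

$$
\langle \delta\tilde{n}_{\rm cl}(\mathbf{k}_1)\, \tilde{\delta}_m(\mathbf{k}_2)\, \tilde{\delta}_m(\mathbf{k}_3) \rangle = \bar{n}_{\rm cl}\, (2\pi)^3 \delta_D^3(\mathbf{k}_1 + \mathbf{k}_2 + \mathbf{k}_3)\, B_{c\delta\delta}(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3). \tag{A11}
$$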
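Similarly to Eq. (A6), the bispectrum can be divided into the 1-, 2- and 3-halo term contributions as

$$
B_{c\delta\delta} = B^{1h}_{c\delta\delta} + B^{2h}_{c\delta\delta} + B^{3h}_{c\delta\delta}. \tag{A12}
$$

Combining Eq. (A7) and Eq. (A10), the 1-halo term of the bispectrum, $B^{1h}_{c\delta\delta}$, is found to be

$$
\bar{n}_{\rm cl}\, B^{1h}_{c\delta\delta}(k_1, k_2, k_3) = \int\! dm\; n(m) \left( \frac{m}{\bar{\rho}_m} \right)^2 S(m)\, \tilde{u}_m(k_2)\, \tilde{u}_m(k_3). \tag{A13}
$$

Here $\tilde{u}_m(k)$ is the Fourier transform of the halo density profile defined as

$$
\tilde{u}_m(k) = \int_0^{r_{\rm vir}} 4\pi r^2 dr\; u_m(r)\, j_0(kr), \tag{A14}
$$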

where we have assumed a spherically symmetric density profile for simplicity, $r_{\rm vir}$ is the virial radius (more generally,
the boundary radius of a halo used for the mass definition), and $j_0(x)$ is the zeroth-order spherical Bessel function,
$j_0(x) = \sin x / x$. Note that $\tilde{u}_m(k)$ has the property that $\tilde{u}_m(k) \to 1$ for $k \to 0$. From Eq. (A13), one finds that the
bispectrum does not depend on the wavenumber $k_1$ that comes from the Fourier transform of the cluster number density
field: since the cluster distribution is modeled as discrete points, the contribution to the 1-halo term arises
from one representative point within a given halo (e.g., the halo center), which corresponds to white noise (and therefore
no $k$-dependence) in the Fourier transform.
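
The behavior of $\tilde{u}_m(k)$ in Eq. (A14) can be checked with a short numerical sketch. For illustration only, it assumes a truncated singular isothermal profile, $u(r) = 1/(4\pi r_{\rm vir} r^2)$ for $r < r_{\rm vir}$, which is normalized to unity but is not the profile adopted in this paper; for this choice the transform has the closed form ${\rm Si}(k r_{\rm vir})/(k r_{\rm vir})$, which gives an independent check of the quadrature.

```python
import math

def j0(x):
    """Zeroth-order spherical Bessel function sin(x)/x, with the x -> 0 limit."""
    return 1.0 if abs(x) < 1e-8 else math.sin(x) / x

def sis_profile(r, r_vir):
    """Hypothetical normalised profile u(r) = 1 / (4 pi r_vir r^2) inside r_vir."""
    return 1.0 / (4.0 * math.pi * r_vir * r * r)

def u_tilde(k, r_vir, n_steps=2000):
    """Midpoint-rule estimate of int_0^r_vir 4 pi r^2 dr u(r) j0(k r)."""
    dr = r_vir / n_steps
    total = 0.0
    for i in range(n_steps):
        r = (i + 0.5) * dr
        total += 4.0 * math.pi * r * r * sis_profile(r, r_vir) * j0(k * r) * dr
    return total

for k in (1e-6, 1.0, math.pi, 10.0):
    print(f"k r_vir = {k:8.3g}: u_tilde = {u_tilde(k, 1.0):.5f}")
```

The transform tends to unity on scales much larger than the halo and is suppressed once $k r_{\rm vir} \gtrsim 1$, which is why the 1-halo term only contributes at small scales in the mass fields.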

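Similarly, from Eq. (A8), the 2-halo term contribution to the bispectrum, $B^{2h}_{c\delta\delta}$, is given by

$$
\begin{aligned}
\bar{n}_{\rm cl}\, B^{2h}_{c\delta\delta}(k_1, k_2, k_3)
&= \left[ \int\! dm_1\; n(m_1) S(m_1) b(m_1) \right] \left[ \int\! dm_2\; n(m_2) b(m_2) \left( \frac{m_2}{\bar{\rho}_m} \right)^2 \tilde{u}_{m_2}(k_2)\, \tilde{u}_{m_2}(k_3) \right] P_\delta^L(k_1) \\
&\quad + \left[ \int\! dm_1\; n(m_1) S(m_1) b(m_1) \frac{m_1}{\bar{\rho}_m} \tilde{u}_{m_1}(k_2) \right] \left[ \int\! dm_2\; n(m_2) b(m_2) \frac{m_2}{\bar{\rho}_m} \tilde{u}_{m_2}(k_3) \right] P_\delta^L(k_3) \\
&\quad + \left[ \int\! dm_1\; n(m_1) S(m_1) b(m_1) \frac{m_1}{\bar{\rho}_m} \tilde{u}_{m_1}(k_3) \right] \left[ \int\! dm_2\; n(m_2) b(m_2) \frac{m_2}{\bar{\rho}_m} \tilde{u}_{m_2}(k_2) \right] P_\delta^L(k_2).
\end{aligned} \tag{A15}
$$

Here, square brackets are used to emphasize that the halo mass integral in the terms enclosed by square
brackets can be calculated separately from the other terms.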

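From Eq. (A9), the 3-halo term of the bispectrum, $B^{3h}_{c\delta\delta}$, is found to be

$$
\bar{n}_{\rm cl}\, B^{3h}_{c\delta\delta}(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3)
= \left[ \int\! dm_1\; n(m_1) S(m_1) b(m_1) \right] \left[ \int\! dm_2\; n(m_2) b(m_2) \frac{m_2}{\bar{\rho}_m} \tilde{u}_{m_2}(k_2) \right]
\left[ \int\! dm_3\; n(m_3) b(m_3) \frac{m_3}{\bar{\rho}_m} \tilde{u}_{m_3}(k_3) \right] B^{PT}(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3), \tag{A16}
$$

where $B^{PT}$ is the perturbation theory prediction for the mass bispectrum given by

$$
B^{PT}(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) = \left[ \frac{10}{7} + \left( \frac{k_1}{k_2} + \frac{k_2}{k_1} \right) \frac{\mathbf{k}_1 \cdot \mathbf{k}_2}{k_1 k_2}
+ \frac{4}{7} \frac{(\mathbf{k}_1 \cdot \mathbf{k}_2)^2}{k_1^2 k_2^2} \right] P_\delta^L(k_1)\, P_\delta^L(k_2) + 2\ {\rm perm.}, \tag{A17}
$$

where the terms denoted by '2 perm.' are obtained from the two permutations $k_1 \leftrightarrow k_3$ and $k_2 \leftrightarrow k_3$ (e.g., see [102]
for the derivation of $B^{PT}$).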
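
The squeezed-triangle behavior of $B^{PT}$, which becomes relevant for the cross-covariance, can be illustrated numerically. The sketch below evaluates the three permutations of Eq. (A17) for a closed triangle; the power spectrum $P(k) = k/(1+k^2)^2$ is a hypothetical stand-in chosen only because it vanishes as $k \to 0$, and the function names are illustrative.

```python
import math

def pt_kernel(ka, kb, mu):
    """Bracket of Eq. (A17): 10/7 + mu (ka/kb + kb/ka) + (4/7) mu^2."""
    return 10.0 / 7.0 + mu * (ka / kb + kb / ka) + (4.0 / 7.0) * mu * mu

def toy_linear_power(k):
    """Hypothetical spectrum with P -> 0 as k -> 0 (illustration only)."""
    return k / (1.0 + k * k) ** 2

def pt_bispectrum(k1, k2, mu12):
    """B^PT for a closed triangle k1 + k2 + k3 = 0; mu12 is the cosine between k1 and k2."""
    k3 = math.sqrt(k1 * k1 + k2 * k2 + 2.0 * k1 * k2 * mu12)
    mu13 = -(k1 + k2 * mu12) / k3   # cosine between k1 and k3 = -(k1 + k2)
    mu23 = -(k2 + k1 * mu12) / k3   # cosine between k2 and k3
    p1, p2, p3 = (toy_linear_power(k) for k in (k1, k2, k3))
    return (pt_kernel(k1, k2, mu12) * p1 * p2
            + pt_kernel(k1, k3, mu13) * p1 * p3
            + pt_kernel(k2, k3, mu23) * p2 * p3)

for k1 in (1e-1, 1e-2, 1e-3, 1e-4):
    print(f"k1 = {k1:.0e}: B_PT = {pt_bispectrum(k1, 0.5, 0.3):.3e}")
```

In the limit $k_1 \to 0$ the kernel for the two remaining, nearly antiparallel wavevectors vanishes, and the terms proportional to $P(k_1)$ cancel at leading order, so the bispectrum tends to zero.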
APPENDIX B: COVARIANCES OF THE CLUSTER NUMBER COUNTS AND LENSING POWER SPECTRUM



Using the correlation functions shown in the preceding section, we are ready to derive covariances of the cluster 
counts, and the cross-covariance with the lensing power spectrum. 



1. Covariances of the cluster counts



In this paper we have considered the average angular number density of clusters drawn from a given survey region
on the sky as our observable from the cluster count experiments. An estimator of the angular number density is given
by Eq. (6), which is slightly modified from Eq. (A3) so that the counts include redshift binning via a modification
of the selection function to $S_{(b)}(m; z)$ (the subscript '$b$' denotes the $b$-th redshift bin). The covariance between the
number densities in redshift bins $b$ and $b'$ is defined by Eq. (12) and can be computed, using Eq. (A5) and Limber's
approximation, as

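$$
\begin{aligned}
[C^g]_{bb'} &\equiv \langle N_{(b)} N_{(b')} \rangle - \bar{N}_{(b)} \bar{N}_{(b')} \\
&= \int\! d^2\theta\, W(\boldsymbol{\theta}) \int\! d^2\theta'\, W(\boldsymbol{\theta}') \int\! d\chi \frac{d^2V}{d\chi d\Omega} \int\! d\chi' \frac{d^2V}{d\chi' d\Omega}
\left[ \langle n_{{\rm cl},(b)}(\chi, \chi\boldsymbol{\theta})\, n_{{\rm cl},(b')}(\chi', \chi'\boldsymbol{\theta}') \rangle - \bar{n}_{{\rm cl},(b)}\, \bar{n}_{{\rm cl},(b')} \right] \\
&= \int\! d^2\theta\, W(\boldsymbol{\theta}) \int\! d^2\theta'\, W(\boldsymbol{\theta}') \int\! d\chi \frac{d^2V}{d\chi d\Omega} \int\! d\chi' \frac{d^2V}{d\chi' d\Omega}
\Big\{ \delta_D(\chi - \chi')\, \delta_D^2(\chi\boldsymbol{\theta} - \chi'\boldsymbol{\theta}') \int\! dm\; S_{(b)}(m; z)\, S_{(b')}(m; z')\, n(m) \\
&\qquad + \left[ \int\! dm_1\; S_{(b)}(m_1; z)\, n(m_1) b(m_1) \right] \left[ \int\! dm_2\; S_{(b')}(m_2; z')\, n(m_2) b(m_2) \right] \xi_\delta^L(\mathbf{x} - \mathbf{x}'; z, z') \Big\} \\
&= \delta^K_{bb'} \int\! d^2\theta\, W^2(\boldsymbol{\theta}) \int\! d\chi \left( \frac{d^2V}{d\chi d\Omega} \right)^2 \chi^{-2} \int\! dm\; S_{(b)}(m; z)\, n(m) \\
&\quad + \delta^K_{bb'} \int\! d^2\theta\, W(\boldsymbol{\theta}) \int\! d^2\theta'\, W(\boldsymbol{\theta}') \int\! d\chi \left( \frac{d^2V}{d\chi d\Omega} \right)^2 \chi^{-2}
\left[ \int\! dm\; S_{(b)}(m; z)\, n(m) b(m) \right]^2 \int\! \frac{d^2l}{(2\pi)^2}\, P_\delta^L(k = l/\chi; \chi)\, e^{i \mathbf{l} \cdot (\boldsymbol{\theta} - \boldsymbol{\theta}')} \\
&= \delta^K_{bb'} \left\{ \frac{\bar{N}_{(b)}}{\Omega_s} + \int\! d\chi \left( \frac{d^2V}{d\chi d\Omega} \right)^2 \chi^{-2}
\left[ \int\! dm\; S_{(b)}(m; z)\, n(m) b(m) \right]^2 \int\! \frac{l\, dl}{2\pi}\, P_\delta^L(k = l/\chi; \chi)\, \big| \tilde{W}(l\Theta_s) \big|^2 \right\},
\end{aligned} \tag{B1}
$$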

where $P_\delta^L$ is the linear mass power spectrum, $\bar{N}_{(b)}$ is the ensemble average of the angular number density estimator
given by Eq. (5), $\Omega_s$ is the surveyed area, and $\tilde{W}(l)$ is the Fourier transform of the survey window function (see
text below Eq. [13] for the details). To be more explicit about the calculation above: in the fourth line on
the r.h.s., we have employed Limber's approximation for the calculation of the 2-halo term, so that the multiple line-of-sight
integral of clustering contributions at different redshifts is replaced with a single line-of-sight integral. This is a good
approximation when the redshift bin width of the number counts is much larger than the correlation length of
clusters. In addition, since we assume that the cluster redshift bins do not overlap, the selection functions for two redshift
bins have the property $S_{(b)} S_{(b')} = \delta^K_{bb'} S_{(b)}^2 = \delta^K_{bb'} S_{(b)}$. Therefore the 1- and 2-halo terms are both proportional to
the Kronecker delta function $\delta^K_{bb'}$, ensuring that there is no correlation between the cluster counts in different redshift
bins. In the last equality for the 1-halo term calculation, we have used $(d^2V/d\chi d\Omega)^2 \chi^{-2} = d^2V/d\chi d\Omega$, and the integral
of the survey window function is computed as $\int\! d^2\theta\, W^2(\boldsymbol{\theta}) = 1/\Omega_s$ because of the normalization condition for the
window function, $\int\! d^2\theta\, W(\boldsymbol{\theta}) = 1$ (we have assumed a top-hat window function for simplicity).

The covariance of the cluster counts has two distinct contributions: the first term in Eq. (B1) represents the shot
noise contribution arising from the imperfect sampling of fluctuations due to a finite number of clusters, while the
second term represents the sampling variance arising from fluctuations of the cluster distribution due to a finite survey
volume. It should be stressed that, based on the halo model approach, we can derive the shot noise contribution
to the covariance without introducing the term ad hoc, as is conventionally done in the literature [21]. We would also like to
emphasize that the sample variance in Eq. (B1) is consistent with the results derived in [63].



2. The lensing power spectrum covariance 



In this subsection, we derive the covariance of the lensing power spectrum following the formulation developed 
in [4^ . In this paper we have focused on the lensing power spectrum as our lensing observable. Under a flat- 
sky approximation, the power spectrum is constructed from the two-dimensional Fourier transform of the measured 
convergence field available over a given survey region. The Fourier decomposition has to be done for modes taken
from a finite survey region. For such a finite-sky measurement, an infinite number of Fourier modes is not available.
Therefore, the Fourier decomposition is by nature discrete, and the fundamental mode is limited by the size of the surveyed
area, $l_{\rm min} = 2\pi/\Theta_s$, where the survey area is given by $\Omega_s = \Theta_s^2$ (we assume a square survey geometry for simplicity).
For this case, the convergence field can be expanded using the discrete Fourier decomposition as

$$
\kappa(\boldsymbol{\theta}) = \frac{1}{\Omega_s} \sum_{\mathbf{l}} \tilde{\kappa}_{\mathbf{l}}\, e^{i \mathbf{l} \cdot \boldsymbol{\theta}}, \tag{B2}
$$

where the summation runs over the combinations of integers $(n_x, n_y)$ for $\mathbf{l} = (2\pi/\Theta_s)(n_x, n_y)$. Here we consider the
convergence field for a single source redshift bin for simplicity; it is straightforward to extend the following
discussion to a tomographic case. In the infinite survey limit ($\Theta_s \to \infty$), the Fourier transform above becomes

$$
\kappa(\boldsymbol{\theta}) = \int\! \frac{d^2l}{(2\pi)^2}\, \tilde{\kappa}_{\mathbf{l}}\, e^{i \mathbf{l} \cdot \boldsymbol{\theta}}. \tag{B3}
$$

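In the discrete Fourier expansion, the orthogonality relation for the eigenmode function $e^{i \mathbf{l} \cdot \boldsymbol{\theta}}$ is given by

$$
\int\! d^2\theta\; e^{i (\mathbf{l} - \mathbf{l}') \cdot \boldsymbol{\theta}} = \Omega_s\, \delta^K_{\mathbf{l} - \mathbf{l}'}, \tag{B4}
$$

where the integration range for $\int\! d^2\theta$ is confined to the survey area and $\delta^K_{\mathbf{l} - \mathbf{l}'}$ is the Kronecker-type delta function for
vectors defined as

$$
\delta^K_{\mathbf{l} - \mathbf{l}'} = \begin{cases} 1 & \text{if } \mathbf{l} = \mathbf{l}', \\ 0 & \text{otherwise}. \end{cases} \tag{B5}
$$

The orthogonality relation (B4) implies that the Kronecker delta function $\delta^K_{\mathbf{l} - \mathbf{l}'}$ should be replaced with the Dirac delta
function for an infinite survey ($\Theta_s \to \infty$) (also see [11]) as

$$
\Omega_s\, \delta^K_{\mathbf{l} - \mathbf{l}'} \to (2\pi)^2 \delta_D^2(\mathbf{l} - \mathbf{l}'). \tag{B6}
$$

From this relation, the definition of the convergence power spectrum is also modified for the finite-sky Fourier
decomposition to

$$
\langle \tilde{\kappa}_{\mathbf{l}}\, \tilde{\kappa}_{\mathbf{l}'} \rangle = \Omega_s\, \delta^K_{\mathbf{l} + \mathbf{l}'}\, P_\kappa(l), \tag{B7}
$$

so that the power spectrum definition matches the conventional one, $\langle \tilde{\kappa}_{\mathbf{l}_1} \tilde{\kappa}_{\mathbf{l}_2} \rangle = (2\pi)^2 \delta_D^2(\mathbf{l}_1 + \mathbf{l}_2) P_\kappa(l_1)$, in the infinite
survey limit. Note that, from Eqs. (B2) and (B4), the inverse Fourier transform is given by $\tilde{\kappa}_{\mathbf{l}} = \int\! d^2\theta\; \kappa(\boldsymbol{\theta})\, e^{-i \mathbf{l} \cdot \boldsymbol{\theta}}$.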

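Once the discrete Fourier modes of the convergence field for a finite survey are defined, an estimator for the
convergence power spectrum measurement may be defined as

$$
\hat{P}_\kappa(l) = \frac{1}{\Omega_s N_p(l)} \sum_{\mathbf{l};\, |\mathbf{l}| \in l} \tilde{\kappa}_{\mathbf{l}}\, \tilde{\kappa}_{-\mathbf{l}}, \tag{B8}
$$

where the summation runs over all the Fourier modes whose length is in the range $l - \Delta l/2 < |\mathbf{l}| < l + \Delta l/2$ for a
given bin width $\Delta l$. Here $N_p(l)$ is the number of modes taken for the summation, and is given by $N_p(l) \equiv \sum_{\mathbf{l};\, |\mathbf{l}| \in l} \approx
2\pi l \Delta l / (2\pi/\Theta_s)^2 = 2 l \Delta l f_{\rm sky}$, where $f_{\rm sky}$ is the sky coverage, $\Omega_s = \Theta_s^2 = 4\pi f_{\rm sky}$. Note that Eq. (14) corresponds to
an integral form of Eq. (B8), where the two approximately match each other for large $l \gg 1/\Theta_s$.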
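
The continuum approximation for the number of modes, $N_p(l) \approx 2 l \Delta l f_{\rm sky}$, can be compared with a direct count of the discrete modes $\mathbf{l} = (2\pi/\Theta_s)(n_x, n_y)$. The following sketch does this for a square survey; the chosen sky fraction, multipole and bin width are arbitrary illustrative values.

```python
import math

def count_modes(l_center, delta_l, theta_s):
    """Count discrete modes l = (2 pi / theta_s)(nx, ny) with |l| inside the bin."""
    l_min = 2.0 * math.pi / theta_s
    n_max = int((l_center + 0.5 * delta_l) / l_min) + 1
    count = 0
    for nx in range(-n_max, n_max + 1):
        for ny in range(-n_max, n_max + 1):
            l_mode = l_min * math.hypot(nx, ny)
            if l_center - 0.5 * delta_l <= l_mode < l_center + 0.5 * delta_l:
                count += 1
    return count

def approx_modes(l_center, delta_l, f_sky):
    """Continuum estimate N_p = 2 l Delta_l f_sky."""
    return 2.0 * l_center * delta_l * f_sky

f_sky = 0.05                                # illustrative sky fraction
theta_s = math.sqrt(4.0 * math.pi * f_sky)  # side of the square survey [rad]
exact = count_modes(1000.0, 100.0, theta_s)
approx = approx_modes(1000.0, 100.0, f_sky)
print(f"direct count = {exact}, continuum estimate = {approx:.0f}")
```

For multipoles well above the fundamental mode the two numbers agree to within the small fluctuations expected from counting lattice points in an annulus.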



Strictly speaking, this treatment of the survey geometry is not consistent with our other assumption of a circular geometry
for the survey window function used in the cluster counts (see around Eq. 13). However, most of the information in the lensing power
spectrum comes from small angular scales, so the geometry does not affect our results as long as the survey area is sufficiently large. In
addition, the lensing covariance depends on the survey area, not on the survey geometry, to a zeroth-order approximation. For these
reasons, we use the approximation for computational simplicity.
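Using Eq. (B7), the ensemble average of the power spectrum estimator (B8) is found to indeed give the underlying
true power spectrum $P_\kappa(l)$:

$$
\langle \hat{P}_\kappa(l) \rangle = \frac{1}{\Omega_s N_p(l)} \sum_{\mathbf{l};\, |\mathbf{l}| \in l} \langle \tilde{\kappa}_{\mathbf{l}}\, \tilde{\kappa}_{-\mathbf{l}} \rangle
= \frac{1}{N_p(l)} \sum_{\mathbf{l};\, |\mathbf{l}| \in l} P_\kappa(|\mathbf{l}|) \approx P_\kappa(l)\, \frac{1}{N_p(l)} \sum_{\mathbf{l};\, |\mathbf{l}| \in l} 1 = P_\kappa(l), \tag{B9}
$$

where we have assumed that the power spectrum changes little within the bin width in the third equality on the r.h.s.
The covariance between the convergence power spectra in multipole bins $l$ and $l'$ is defined as

$$
\begin{aligned}
{\rm Cov}[P_\kappa(l), P_\kappa(l')] &\equiv \langle \hat{P}_\kappa(l)\, \hat{P}_\kappa(l') \rangle - P_\kappa(l)\, P_\kappa(l') \\
&= \frac{1}{\Omega_s^2 N_p(l) N_p(l')} \sum_{\mathbf{l};\, |\mathbf{l}| \in l}\ \sum_{\mathbf{l}';\, |\mathbf{l}'| \in l'}
\langle \tilde{\kappa}_{\mathbf{l}}\, \tilde{\kappa}_{-\mathbf{l}}\, \tilde{\kappa}_{\mathbf{l}'}\, \tilde{\kappa}_{-\mathbf{l}'} \rangle - P_\kappa(l)\, P_\kappa(l').
\end{aligned} \tag{B10}
$$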

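Thus an estimate of the power spectrum covariance requires knowledge of the 4-point correlation function of the
convergence field. The 4-point correlation function generally has two contributions: one is the Gaussian contribution
given by the power spectrum, and the other is the non-Gaussian contribution given by the connected part of the
4-point function, the so-called trispectrum. The trispectrum of the convergence field is naturally induced by the non-linear
evolution of gravitational clustering in structure formation, and carries additional information beyond that of the
2-point functions. For a finite-sky Fourier decomposition, assuming that $\tilde{\kappa}_{\mathbf{l}}$ is a statistically isotropic random field, the
4-point function in Eq. (B10) can be expressed in terms of the power spectrum and the trispectrum as

$$
\langle \tilde{\kappa}_{\mathbf{l}}\, \tilde{\kappa}_{-\mathbf{l}}\, \tilde{\kappa}_{\mathbf{l}'}\, \tilde{\kappa}_{-\mathbf{l}'} \rangle
= \Omega_s^2 P_\kappa(l) P_\kappa(l') + \Omega_s^2 P_\kappa(l)\, \delta^K_{\mathbf{l} + \mathbf{l}'}\, P_\kappa(l')\, \delta^K_{\mathbf{l} + \mathbf{l}'}
+ \Omega_s^2 P_\kappa(l)\, \delta^K_{\mathbf{l} - \mathbf{l}'}\, P_\kappa(l')\, \delta^K_{\mathbf{l} - \mathbf{l}'} + \Omega_s\, T_\kappa(\mathbf{l}, -\mathbf{l}, \mathbf{l}', -\mathbf{l}'). \tag{B11}
$$

Inserting this equation into Eq. (B10) gives

$$
\begin{aligned}
{\rm Cov}[P_\kappa(l), P_\kappa(l')] &= \frac{1}{\Omega_s^2 N_p(l) N_p(l')} \sum_{\mathbf{l};\, |\mathbf{l}| \in l}\ \sum_{\mathbf{l}';\, |\mathbf{l}'| \in l'}
\left[ \Omega_s^2 P_\kappa(l)^2\, \delta^K_{\mathbf{l} + \mathbf{l}'} + \Omega_s^2 P_\kappa(l)^2\, \delta^K_{\mathbf{l} - \mathbf{l}'} + \Omega_s\, T_\kappa(\mathbf{l}, -\mathbf{l}, \mathbf{l}', -\mathbf{l}') \right] \\
&= \frac{2}{N_p(l) N_p(l')} \sum_{\mathbf{l};\, |\mathbf{l}| \in l}\ \sum_{\mathbf{l}';\, |\mathbf{l}'| \in l'} P_\kappa(l)^2\, \delta^K_{\mathbf{l} - \mathbf{l}'}
+ \frac{1}{N_p(l) N_p(l')\, \Omega_s} \sum_{\mathbf{l};\, |\mathbf{l}| \in l}\ \sum_{\mathbf{l}';\, |\mathbf{l}'| \in l'} T_\kappa(\mathbf{l}, -\mathbf{l}, \mathbf{l}', -\mathbf{l}') \\
&\approx \frac{2\, \delta^K_{ll'}}{N_p(l)}\, P_\kappa(l)^2 + \frac{1}{\Omega_s} \int_{|\mathbf{l}| \in l} \frac{d^2l}{A(l)} \int_{|\mathbf{l}'| \in l'} \frac{d^2l'}{A(l')}\; T_\kappa(\mathbf{l}, -\mathbf{l}, \mathbf{l}', -\mathbf{l}'),
\end{aligned} \tag{B12}
$$

where $A(l) \equiv \int_{|\mathbf{l}| \in l} d^2l \approx 2\pi l \Delta l$. In the third equality, for the first term, we have replaced the Kronecker-type
delta function for vectors, $\delta^K_{\mathbf{l} - \mathbf{l}'}$, with the delta function for scalars, $\delta^K_{ll'}$, because the first term is non-vanishing
only if the two multipoles $l$ and $l'$ are the same to within the bin widths. In the last line, we have used the integral form for the
second term rather than the summation form for notational simplicity. The first term of the covariance represents the
Gaussian errors, for which the power spectra at different multipoles are independent, while the second term represents
the non-Gaussian errors that describe correlations between the power spectra in different multipole bins. Extending
Eq. (B12) to the tomographic case for the source redshift distribution gives Eq. (15) in the main text. Eq. (B12) is
equivalent to the expression used in [46] when $l, l' \gg 1$.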
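
To give a feel for the size of the Gaussian term in Eq. (B12), the sketch below evaluates the variance $2 P_\kappa(l)^2 / N_p(l)$ with $N_p(l) = 2 l \Delta l f_{\rm sky}$ and the corresponding fractional error on the band power. It deliberately ignores the shape-noise contribution and the trispectrum term, and the survey numbers are illustrative assumptions only.

```python
import math

def gaussian_variance(p_kappa, l_center, delta_l, f_sky):
    """First term of Eq. (B12): 2 P^2 / N_p with N_p = 2 l Delta_l f_sky."""
    n_modes = 2.0 * l_center * delta_l * f_sky
    return 2.0 * p_kappa ** 2 / n_modes

def fractional_gaussian_error(l_center, delta_l, f_sky):
    """sigma(P) / P for the Gaussian term alone (independent of P itself)."""
    return math.sqrt(gaussian_variance(1.0, l_center, delta_l, f_sky))

for f_sky in (0.05, 0.2):
    err = fractional_gaussian_error(1000.0, 100.0, f_sky)
    print(f"f_sky = {f_sky}: fractional Gaussian error = {err:.4f}")
```

The fractional error scales as $f_{\rm sky}^{-1/2}$, so quadrupling the survey area halves the Gaussian error on each band power.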



3. Cross-covariance between the cluster number counts and the lensing power spectrum 



We can now derive the cross-covariance between the cluster counts and the cosmic shear power spectrum. For 
illustrative purposes, we consider a single redshift bin for both the lensing power spectrum measurement and the 
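cluster counts. From Eqs. (6) and (B8), the cross-covariance is defined as

$$
\begin{aligned}
{\rm Cov}[N, P_\kappa(l)] &\equiv \langle N\, \hat{P}_\kappa(l) \rangle - \bar{N}\, P_\kappa(l) \\
&= \Big\langle \int\! d^2\theta\; W(\boldsymbol{\theta})\, \delta N_{\rm cl}(\boldsymbol{\theta})\; \hat{P}_\kappa(l) \Big\rangle \\
&= \frac{1}{\Omega_s N_p(l)} \int\! d^2\theta\; W(\boldsymbol{\theta}) \sum_{\mathbf{l};\, |\mathbf{l}| \in l} \langle \delta N_{\rm cl}(\boldsymbol{\theta})\, \tilde{\kappa}_{\mathbf{l}}\, \tilde{\kappa}_{-\mathbf{l}} \rangle \\
&= \frac{1}{\Omega_s^2 N_p(l)} \int\! d^2\theta\; W(\boldsymbol{\theta}) \sum_{\mathbf{l};\, |\mathbf{l}| \in l}\ \sum_{\mathbf{l}'} \langle \tilde{\kappa}_{\mathbf{l}}\, \tilde{\kappa}_{-\mathbf{l}}\, \delta\tilde{N}_{\mathbf{l}'} \rangle\, e^{i \mathbf{l}' \cdot \boldsymbol{\theta}}.
\end{aligned} \tag{B13}
$$

In the second line on the r.h.s., we introduced the angular number density fluctuation field of the cluster counts,
$\delta N_{\rm cl}(\boldsymbol{\theta})$, for convenience in the following discussion. Note that $\delta N_{\rm cl}(\boldsymbol{\theta})$ should not be confused with the ensemble average of the
average angular number density, $\bar{N}$. The angular number density fluctuation field can be expressed in terms of the
three-dimensional number density fluctuation field of clusters defined in Eq. (A3):

$$
\delta N_{\rm cl}(\boldsymbol{\theta}) = \int\! d\chi\; \frac{d^2V}{d\chi d\Omega}\, \delta n_{\rm cl}(\chi, \chi\boldsymbol{\theta}). \tag{B14}
$$

In the fourth line on the r.h.s. of Eq. (B13), we used the Fourier transform of $\delta N_{\rm cl}$ in the discrete Fourier
decomposition for a finite-sky survey, as discussed around Eq. (B2):

$$
\delta N_{\rm cl}(\boldsymbol{\theta}) = \frac{1}{\Omega_s} \sum_{\mathbf{l}'} \delta\tilde{N}_{\mathbf{l}'}\, e^{i \mathbf{l}' \cdot \boldsymbol{\theta}}. \tag{B15}
$$

As in Eq. (B7), the ensemble average of the product of the cluster number density fluctuation field and the
two convergence fields, appearing in the last line on the r.h.s. of Eq. (B13), can be expressed in terms of the angular
bispectrum defined as

$$
\langle \tilde{\kappa}_{\mathbf{l}}\, \tilde{\kappa}_{-\mathbf{l}}\, \delta\tilde{N}_{\mathbf{l}'} \rangle = \Omega_s\, B_{c\text{-}wl}(\mathbf{l}', \mathbf{l}, -\mathbf{l})\, \delta^K_{\mathbf{l}'}. \tag{B16}
$$

Substituting this equation into Eq. (B13) allows further simplification of Eq. (B13) as

$$
\begin{aligned}
{\rm Cov}[N, P_\kappa(l)] &= \frac{1}{\Omega_s N_p(l)} \int\! d^2\theta\; W(\boldsymbol{\theta}) \sum_{\mathbf{l};\, |\mathbf{l}| \in l} B_{c\text{-}wl}(\mathbf{l}' = 0, \mathbf{l}, -\mathbf{l}) \\
&= \frac{1}{\Omega_s N_p(l)} \sum_{\mathbf{l};\, |\mathbf{l}| \in l} B_{c\text{-}wl}(\mathbf{l}' = 0, \mathbf{l}, -\mathbf{l}) \\
&\approx \frac{1}{\Omega_s}\, B_{c\text{-}wl}(l' = 0, l, l),
\end{aligned} \tag{B17}
$$

where we have used $\int\! d^2\theta\; W(\boldsymbol{\theta}) = 1$ and assumed, in the third line on the r.h.s., that the bispectrum $B_{c\text{-}wl}(\mathbf{l}' = 0, \mathbf{l}, -\mathbf{l})$
changes little within the multipole bin width.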

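Employing Limber's approximation, the angular bispectrum (B16) can be expressed in terms of the 3D bispectrum
defined by Eq. (A12) as

$$
B_{c\text{-}wl}(l_1, l_2, l_3) = \int\! d\chi\; \frac{d^2V}{d\chi d\Omega}\, W_g^2(\chi)\, \chi^{-4}\, \bar{n}_{\rm cl}(\chi)\, B_{c\delta\delta}(k_1 = l_1/\chi, k_2 = l_2/\chi, k_3 = l_3/\chi). \tag{B18}
$$

Inserting this equation into Eq. (B17) gives the final expression for the cross-covariance between the cluster counts
and the lensing power spectrum:

$$
{\rm Cov}[N, P_\kappa(l)] = \frac{1}{\Omega_s} \int\! d\chi\; \frac{d^2V}{d\chi d\Omega}\, W_g^2(\chi)\, \chi^{-4}\, \bar{n}_{\rm cl}(\chi)\, B_{c\delta\delta}(k_1 = 0, k_2 = l/\chi, k_3 = l/\chi). \tag{B19}
$$

Note that the cross-covariance scales with the sky coverage as $1/f_{\rm sky}$. Further, it is instructive to show explicitly
each of the 1-, 2- and 3-halo term contributions to the covariance. Inserting Eq. (A13) into Eq. (B19), the 1-halo
term contribution to the covariance can be expressed as

$$
{\rm Cov}[N, P_\kappa(l)]^{1h} = \frac{1}{\Omega_s} \int\! d\chi\; \frac{d^2V}{d\chi d\Omega}\, W_g^2(\chi)\, \chi^{-4} \int\! dm\; n(m) \left( \frac{m}{\bar{\rho}_m} \right)^2 S(m)\, \tilde{u}_m(k = l/\chi)\, \tilde{u}_m(k = l/\chi). \tag{B20}
$$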
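Similarly, from Eq. (A15), the 2-halo term contribution is found to be

$$
{\rm Cov}[N, P_\kappa(l)]^{2h} = \frac{2}{\Omega_s} \int\! d\chi\; \frac{d^2V}{d\chi d\Omega}\, W_g^2(\chi)\, \chi^{-4}
\left[ \int\! dm\; n(m) S(m) b(m) \frac{m}{\bar{\rho}_m} \tilde{u}_m(k = l/\chi) \right]
\left[ \int\! dm'\; n(m') b(m') \frac{m'}{\bar{\rho}_m} \tilde{u}_{m'}(k = l/\chi) \right] P_\delta^L(k = l/\chi; \chi), \tag{B21}
$$

where we have used $P_\delta^L(k) \to 0$ for $k \to 0$ for the linear power spectrum, as predicted from the inflation-motivated
primordial power spectrum, so that the first term of Eq. (A15) drops out and the remaining two terms are equal. It is also
interesting to find that the 3-halo term contribution to the covariance vanishes,

$$
{\rm Cov}[N, P_\kappa(l)]^{3h} = 0, \tag{B22}
$$

because the perturbation theory bispectrum $B^{PT}(\mathbf{k}_1, \mathbf{k}_2, \mathbf{k}_3) \to 0$ for $k_1 \to 0$.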

APPENDIX C: A TOY MODEL 

In this section we show that the qualitative behavior of the percentage difference in S/N (plotted in the lower panels
of Figs. 7 and 10) can be recovered using a surprisingly simple toy model. Although extremely simple, it helps us
gain intuition for the cause of the increase and decrease in the percentage difference in S/N.

Imagine a simple universe in which all halos exist only in a small range of redshifts around redshift $z$, are unclustered
($P_\delta^L(k) = 0$), and have a profile which is independent of mass ($\tilde{u}_m(k = l/\chi) = \tilde{u}(k = l/\chi)$). The expression for the cluster counts
may be rewritten as

$$
D^g \equiv N = \chi^2 \delta\chi \int\! dm\; S(m)\, n(m), \tag{C1}
$$

where we assume that the volume element over a small redshift interval is given by $\int\! d\chi\, (d^2V/d\chi d\Omega) = \chi^2 \delta\chi$, and
consider a single redshift bin. Further, from Eq. (10) and using the halo model expression for the 1-halo term of the
3D mass power spectrum (e.g. see Eq. [9] in [56]), the lensing power spectrum for our simple universe can be shown
to be given by

$$
D^\kappa \equiv P_\kappa = W_g^2(\chi)\, \chi^{-2} \delta\chi\; \tilde{u}^2 \int\! dm\; n(m) \left( \frac{m}{\bar{\rho}_m} \right)^2. \tag{C2}
$$

For simplicity we consider observations at a single multipole $l$ using one redshift bin, so the lensing power spectrum
measurement is a single number.

To reproduce the results in Fig. [7] we also need to calculate the covariances in this simple universe. We will assume
that there is no galaxy intrinsic ellipticity, so that the shot noise term due to intrinsic galaxy shapes is
negligible. Further we assume that the lensing power spectrum covariance arises only from the 1-halo term of the
lensing trispectrum in Eq. (15); in other words, we ignore the Gaussian error contribution (the first term in Eq. [15]).
Substituting the 1-halo term of the lensing trispectrum into Eq. (15) for this simple universe gives

$$
C^\kappa = \frac{1}{\Omega_s}\, W_g^4(\chi)\, \chi^{-6} \delta\chi\; \tilde{u}^4 \int\! dm\; n(m) \left( \frac{m}{\bar{\rho}_m} \right)^4. \tag{C3}
$$

The cluster count variance is given by the shot noise (see Eq. 13):

$$
C^g = \frac{1}{\Omega_s}\, \chi^2 \delta\chi \int\! dm\; S(m)\, n(m). \tag{C4}
$$

The cross-covariance between the cluster counts and the lensing power spectrum is expected from Eq. (B19) to be

$$
C^{g\kappa} = \frac{1}{\Omega_s}\, W_g^2(\chi)\, \chi^{-2} \delta\chi\; \tilde{u}^2 \int\! dm\; S(m)\, n(m) \left( \frac{m}{\bar{\rho}_m} \right)^2. \tag{C5}
$$

Thus the only ingredients are the mass function weighted by various powers of the mass and by the cluster selection
function, the lensing efficiency, and the volume element.
The correlation coefficient between the cluster counts and the lensing power spectrum is given, as before, by

$$
r = \frac{C^{g\kappa}}{\sqrt{C^g\, C^\kappa}}.
$$



FIG. 15: Left panel: Correlation coefficient r between cluster counts and the lensing power spectrum for the simple toy model 
containing only the halo mass function and mass weighting. Right panel: Percentage difference in S/N (to be compared with 
the lower panel of Fig. [7]) for the simple toy model containing only the halo mass function and mass weighting. 



Note that all the prefactors in Eqs. (C3), (C4) and (C5) appearing in front of the halo mass integrals drop out, and
therefore the results shown in this Appendix are independent of redshift and multipole. The correlation coefficient is plotted in the left
panel of Fig. [15] and may be compared to the lower panel of Fig. [5] for the full treatment. The shape is remarkably
similar to that for the full treatment, implying that this simple model containing the different weightings of the mass
function captures the essence of the complementarity. The correlation peaks at around lO^^M© and decreases at low 
minimum cluster masses. It makes sense that the cluster counts and lensing are correlated at high minimum cluster 
masses, because the mass weighting in the toy lensing power spectrum is similar to the mass cut in the cluster counts: 
they are both dominated by high mass clusters. They become less correlated at lower minimum masses because the 
cluster counts are dominated by low mass clusters that contribute less to the lensing power spectrum. 

The size of the correlation is larger for this simple model than for the full treatment. This makes sense because the
full model contains several ingredients that reduce the correlation, including: the contributions of the Gaussian
errors and shot noise to the lensing power spectrum covariance (e.g., compare the thick and thin lines in the lower panel
of Fig. [5]); halos at a range of redshifts, which are weighted differently by the lensing power spectra and the cluster
counts; and terms involving more than one halo at a time. However, an even simpler toy model, in which all halos
in the universe have the same mass, would make the lensing power spectra and cluster counts 100 per cent correlated.
We see that simply including the mass weighting $\propto m^2$ for the lensing power spectra prevents a complete redundancy of
information and starts to explain how these seemingly similar probes can be so complementary.
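
The toy-model correlation coefficient can be sketched in a few lines. Because all the prefactors cancel, $r$ reduces to a ratio of mass-function moments: the selected $m^2$ moment divided by the square root of the selected number times the total $m^4$ moment. The binned mass function below, $n(m) \propto m^{-2} e^{-m/m_*}$ with a sharp mass threshold, is a hypothetical illustration and not the mass function used for Fig. 15.

```python
import math

def toy_correlation(masses, abundances, m_min):
    """r = C^{g kappa} / sqrt(C^g C^kappa) with all common prefactors dropped."""
    counts = sum(n for m, n in zip(masses, abundances) if m >= m_min)
    cross = sum(n * m ** 2 for m, n in zip(masses, abundances) if m >= m_min)
    lensing = sum(n * m ** 4 for m, n in zip(masses, abundances))
    return cross / math.sqrt(counts * lensing)

def toy_mass_grid(m_lo=1e12, m_hi=1e16, n_bins=400, m_star=1e14):
    """Hypothetical n(m) proportional to m^-2 exp(-m/m_star), binned in ln m."""
    dlnm = math.log(m_hi / m_lo) / n_bins
    masses = [m_lo * math.exp((i + 0.5) * dlnm) for i in range(n_bins)]
    abundances = [m ** -2 * math.exp(-m / m_star) * m * dlnm for m in masses]
    return masses, abundances

masses, abundances = toy_mass_grid()
for m_min in (1e13, 1e14, 1e15):
    print(f"M_min = {m_min:.0e}: r = {toy_correlation(masses, abundances, m_min):.3f}")

# A universe in which every halo has the same mass is fully correlated.
print("single-mass universe: r =", toy_correlation([1e14], [3.0], 1e13))
```

With a spread of halo masses the coefficient is strictly below unity, while the single-mass case returns $r = 1$, as argued above.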

In this simple model we have only one data point from the cluster counts, $D^g$, and one data point from lensing, $D^\kappa$ (since
we have only one redshift bin for each, and we are considering a single multipole $l$). Therefore we can write the
signal-to-noise in terms of the correlation coefficient:

$$
\left( \frac{S}{N} \right)^2 = \frac{1}{1 - r^2} \left[ \frac{(D^g)^2}{C^g} + \frac{(D^\kappa)^2}{C^\kappa} - \frac{2\, r\, D^g D^\kappa}{\sqrt{C^g\, C^\kappa}} \right].
$$

All the terms scale with $\Omega_s \chi^2 \delta\chi$, the comoving volume of the redshift interval considered. We use a redshift slice at
$z = 0.3$ of thickness 0.1 for illustration.

To reproduce the plot of the percentage difference in $(S/N)$ we need to compare this to the $(S/N)$ when the covariance
is not taken into account, found by setting $r = 0$ in the above. The factor $1/(1 - r^2)$ is close to unity when the
correlation coefficient $r$ is small, but when the correlation is strong ($r \sim 1$) it becomes much larger. This causes the
$(S/N)$ to be larger when the covariance is included than when it is not, and gives rise to the peak in the
right-hand panel of Fig. 15. The final term in the square brackets causes a decrease in S/N relative to the case where
no covariance is included. This is important especially when the correlation is small and the factor $1/(1 - r^2)$ is
unimportant, and it explains the dip in the right-hand panel of Fig. 15. In summary, the fact that Fig. 15 is qualitatively
similar to the lower panel of Fig. 7 suggests that the peak and dip can be explained just in terms of the different mass
weighting of the cluster counts and the lensing power spectrum.
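
The two-data-point signal-to-noise can be verified against a direct inversion of the $2 \times 2$ covariance matrix. The sketch below implements $(S/N)^2 = (1 - r^2)^{-1} [ (D^g)^2/C^g + (D^\kappa)^2/C^\kappa - 2 r D^g D^\kappa / \sqrt{C^g C^\kappa} ]$ and the explicit inverse; the numerical inputs are arbitrary and chosen only to show both a decrease and an increase relative to the uncorrelated case.

```python
import math

def sn_squared_from_r(d_g, d_k, c_g, c_k, r):
    """(S/N)^2 written in terms of the correlation coefficient r."""
    bracket = (d_g ** 2 / c_g + d_k ** 2 / c_k
               - 2.0 * r * d_g * d_k / math.sqrt(c_g * c_k))
    return bracket / (1.0 - r * r)

def sn_squared_direct(d_g, d_k, c_g, c_k, c_gk):
    """D^T C^-1 D for the 2x2 covariance [[c_g, c_gk], [c_gk, c_k]]."""
    det = c_g * c_k - c_gk ** 2
    return (d_g ** 2 * c_k + d_k ** 2 * c_g - 2.0 * d_g * d_k * c_gk) / det

def percent_change(d_g, d_k, c_g, c_k, r):
    """Percentage difference in S/N relative to the uncorrelated (r = 0) case."""
    with_cov = math.sqrt(sn_squared_from_r(d_g, d_k, c_g, c_k, r))
    without = math.sqrt(sn_squared_from_r(d_g, d_k, c_g, c_k, 0.0))
    return 100.0 * (with_cov / without - 1.0)

print(percent_change(10.0, 4.0, 4.0, 1.0, 0.5))    # moderate correlation: S/N decreases
print(percent_change(10.0, 1.0, 4.0, 1.0, 0.95))   # strong correlation: S/N increases
```

The first case is dominated by the cross term and lowers the S/N, while in the second the factor $1/(1 - r^2)$ wins and the S/N increases, mirroring the dip and peak discussed above.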




